Deloitte Senior Data Engineer - Hux in San Jose, California
Are you a talented software engineer with outstanding Python skills? Passionate about creating innovative data pipelines from scratch to solve real-world problems using machine learning? Deloitte Digital is seeking creative minds and persistent problem solvers to build and support cutting-edge marketing AI platforms that help our clients deliver meaningful, personalized journeys to delight their customers at every touchpoint.
At Deloitte Digital, we build state-of-the-art marketing platforms that give our clients a single view of their customers' brand interactions across all channels (in-store, online, call centers), and enable our clients to deliver tailored marketing offers that maximize customer happiness and value.
We're growing fast and need brilliant Data Engineers like you to fuel our continuing innovation and growth. Along the way, you'll find exceptional growth opportunities limited only by your hunger for learning and applying new technologies in our exciting, start-up-like environment.
Work you'll do
As a Senior Data Engineer for Machine Learning, you'll design, implement and maintain a full suite of real-time and batch jobs that fuel our cutting-edge AI, providing real-time marketing intelligence to our clients.
You'll develop, test and deliver production-grade code to help our clients solve their most challenging marketing problems using cutting-edge big data tools. You'll also ensure data integrity, resolve production issues, and assist in the support and maintenance of our overall platform.
With a platform that can ingest, load and process billions of data points, you'll enjoy new challenges and opportunities to showcase your development skills on project teams to build innovative new-client platforms and execute highly visible strategic development projects.
Your responsibilities will include:
Design, construct, install, test and maintain highly scalable data pipelines with state-of-the-art monitoring and logging practices.
Bring together large, complex and sparse data sets to meet functional and non-functional business requirements.
Design and implement data tools alongside data scientist team members to help them in building, optimizing and tuning our product.
Integrate new data management technologies and software engineering tools into existing structures.
Help build high-performance algorithms, prototypes, predictive models and proofs of concept.
Use a variety of languages, tools and frameworks to marry data and systems together.
Recommend ways to improve data reliability, efficiency and quality.
Collaborate with project data scientists and consultants on meeting project goals, and with our broader engineering team to improve and develop our platform and tooling.
Tackle challenges and solve complex problems on a daily basis.
You'll join a team of passionate, talented Data Engineers who collaborate to design, build and maintain cutting-edge AI solutions that arm our clients with real-time customer insights delivering tremendous value. If you're intellectually curious, hardworking and solution-oriented, you'll fit right into our fast-paced, collaborative environment.
You'll work daily with our project teams and closely with Data Science and DevOps teams to deliver production-grade pipelines that will run unattended for weeks and months. You'll also work closely with our Sales and Product Management teams to understand our clients' needs and any change requests that drive our development efforts.
Required:
Proven track record of 4+ years of experience in software development, a substantial part of which was gained in a high-throughput, decision-automation related environment.
Ability to produce high quality code in Python.
Proven track record of working with products from major cloud providers (AWS, GCP, Azure, etc.).
3+ years of experience in working with big data using technologies like Spark, Kafka, Flink, Hadoop, and NoSQL datastores.
2+ years of experience with distributed, high-throughput, low-latency architectures.
1+ year of experience deploying or managing data pipelines that support data-science-driven decisioning at scale.
A successful track record of manipulating, processing and extracting value from large, disconnected datasets.
Passion for testing, with experience on teams where automated building and testing are the norm.
Proven ability to communicate both verbally and in writing in a high-performance, collaborative environment.
Knowledge of data development best practices, and enjoyment in helping others learn to do the same.
A belief that the best data pipelines run unattended for weeks and months on end.
Deep familiarity with version control, and with productive code reviews which help to catch bugs, improve the code base and spread knowledge.
Helpful, but not required:
Experience with large consumer data sets used in performance marketing is a major advantage.
Deep familiarity with machine learning libraries is a big plus.
Well-versed in (or contributes to) data-centric open source projects.
Reads Hacker News and blogs, or stays on top of emerging tools in some other way.
Experience with industry-specific marketing data.
Technologies of Interest:
Languages/Libraries - Python, Java, Scala, Spark, Kafka, Hadoop, HDFS, Parquet.
Cloud - AWS, Azure, GCP.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or protected veteran status, or any other legally protected basis, in accordance with applicable law.