Smaato - Staff Data Engineer
Portugal, Lisboa, Lisbon - Engineering
Smaato is now a part of Verve Group (www.verve.com), which has created a more efficient and privacy-focused way to buy and monetize advertising. Verve Group is an ecosystem of demand and supply technologies fusing data, media, and technology together to deliver results and growth to both advertisers and publishers, no matter the screen or location, no matter who, what, or where a customer is. With 30 offices across the globe and with an eye on servicing forward-thinking advertising customers, Verve Group’s solutions are trusted by more than 90 of the United States’ top 100 advertisers, 4,000 publishers globally, and the world’s top demand-side platforms. Verve Group is a subsidiary of Media and Games Invest (MGI).
The Data Engineering team at Smaato (part of Verve Group) works on a few very interesting technology problems related to big data, distributed computing, low latency, analytics, visualization, machine learning, and highly scalable platform development. We build reliable, petabyte-scale distributed systems using technologies such as Spark, Hadoop, Apache Flink, Airflow, Kafka, and Druid. As part of Smaato, you will work on the applications where all threads come together: streaming, processing, storage, and the ad exchange. Our ultra-efficient exchange processes more than 125 billion ad requests daily, or a whopping 3.5 trillion a month. Every line of code you write matters, as it is likely to be executed several billion times a day. We are one of the biggest AWS users, with a presence in four different regions.
Smaato’s data platform is a symphony of streams, orchestrations, microservices, schedulers, analytics, and visualization technologies. The platform is supported by polyglot persistence using Druid, DynamoDB, Vertica, and MySQL, and a bunch of orchestration and streaming frameworks like Airflow, Spark, and Flink.
The job will involve constant feature development, performance tuning and platform stability assurance. The mission of our analytics team is “data driven decisions at your fingertips”. You own and provide the system that all business decisions will be based on. Precision and high-quality results are essential in this role.
Our engineers are passionate problem solvers. Be it Java, Python, Scala, Groovy, or TypeScript, we are up for all games :)
What You’ll Do
- Design and architect data platform components supporting millions of requests per second.
- This role is 70% hands-on, enhancing and supporting data pipeline components.
- Lead a small team, coordinate with the product team, lead automation efforts, and review code.
- Deliver high-quality software, meet timelines, and be agile.
- Maintain the current platform and shape its evolution by adopting new technologies.
- Closely collaborate with stakeholders to gather requirements and design new product features in a short feedback loop.
What We'll Need
- 12+ years of experience in big data platforms or distributed systems, with a deep understanding of Apache Druid, AWS, Spark, and Airflow.
- Exposure to highly scalable, low-latency microservice development.
- Strong exposure to application, enterprise, and microservice design patterns.
- Strong understanding of computer science fundamentals, algorithms, and data structures.
- Proficiency in one or more programming languages: Java, Python, or Scala.
- Exposure to AWS, automation, and DevOps (Kubernetes, Jenkins CI/CD pipelines).
- Experience leading a small team of highly productive engineers.
- Proven experience in owning products and driving them end to end, from requirements gathering through development, testing, and deployment to ensuring high availability post-deployment.
- Ability to contribute to architectural and coding standards and to evolve engineering best practices.
- Nice to have: open-source committer, contributor, or PMC member in any of these Apache big data technologies.
- You enjoy operating your applications in production and strive to make on-calls obsolete: you debug issues in live, production, and test clusters, implement solutions that restore service with minimal disruption to the business, and perform root cause analysis.