Data Engineer/Senior Data Engineer

Salesforce

Your Impact:

  • Own the technical solution design and lead the technical architecture and implementation of data acquisition and integration projects, both batch and real-time
  • Define the overall solution architecture needed to implement a layered data stack that ensures a high level of data quality and timely insights
  • Communicate with product owners and analysts to clarify requirements
  • Craft technical solutions and assemble design artifacts (functional design documents, data flow diagrams, data models, etc.)
  • Build data pipelines, data processing tools, and technologies using open source and proprietary products
  • Serve the team as a domain expert & mentor for ETL design, and other related big data and programming technologies
  • Identify incomplete data, improve quality of data, and integrate data from several data sources
  • Proactively identify performance & data quality problems and drive the team to remediate them. Advocate architectural and code improvements to the team to improve execution speed and reliability
  • Design and develop tailored data structures
  • Rework prototypes into production-ready data flows
  • Support Data Science research by designing, developing, and maintaining all parts of the Big Data pipeline for reporting, statistical analysis, machine learning, and computational requirements
  • Perform data profiling, sophisticated sampling, statistical testing, and testing of reliability on data
  • Clearly articulate pros and cons of various technologies and platforms in open source and proprietary products. Implement proofs of concept on new technologies and tools to help the organization pick the best tools and solutions
  • Strong SQL optimization and performance tuning experience in a high volume data environment that uses parallel processing
  • Teams are using the following: SQL, Python, Airflow, AWS, Spark, Tableau, AWS EMR, Snowflake
  • Participate in the team’s on-call rotation to address sophisticated problems in real time and keep services operational and highly available

Required Skills:

  • 4–12 years of experience in data engineering
  • Build programmatic ETL pipelines with SQL based technologies and platforms
  • Solid understanding of databases, and working with sophisticated datasets
  • Data governance, verification, and data documentation using current tools and any tools and platforms adopted in the future
  • Work with different technologies (Python, shell scripts) and translate logic into well-performing SQL
  • Perform tasks such as writing scripts, web scraping, getting data from APIs etc.
  • Automate data pipelines using scheduling tools like Airflow
  • Experience with CI/CD technologies and tools such as Jenkins, Ant or Gradle, and GitHub
  • Be prepared for changes in business direction and understand when to adjust designs
  • Experience writing production level SQL code and good understanding of Data Engineering pipelines
  • Experience with Hadoop ecosystem and similar frameworks
  • Previous projects should display technical leadership with an emphasis on data lake, data warehouse solutions, business intelligence, big data analytics, enterprise-scale custom data products
  • Knowledge of data modeling techniques and high-volume ETL/ELT design
  • Experience with version control systems (GitHub, Subversion) and deployment tools (e.g., continuous integration) required
  • Experience working with public cloud platforms like GCP, AWS, or Snowflake
  • Ability to work effectively in an unstructured and fast-paced environment, both independently and in a team setting, with a high degree of self-management, clear communication, and commitment to delivery timelines
  • A related technical degree required

To apply for this job please visit salesforce.wd12.myworkdayjobs.com.
