Data Integration Engineers
MULTIPLE POSITIONS: Build and maintain scalable, repeatable, and secure data pipelines that serve multiple users. Responsible for obtaining data from a variety of sources, formatting it, and ensuring it adheres to data quality, privacy, and security standards and is quickly available to downstream users. Design and build a scalable, low-latency, fault-tolerant streaming data platform that empowers end users to extract meaningful and timely insights from data assets. Work with business and technology stakeholders to build next-generation distributed streaming data pipelines and analytics data stores using streaming frameworks (e.g., Flink, Spark Streaming). Build platforms that gather and collect data, store it, perform real-time processing on it, and serve it to end users and decision-making systems. Assist in driving adoption of the Data Analytics platform as part of the larger data strategy while maintaining an ongoing understanding of emerging data management technologies, industry trends, and best practices. Identify ways to improve data reliability, efficiency, and quality.
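To give a flavor of the clean, transform, and aggregate pipeline work described above, here is a minimal sketch in plain Python rather than a distributed framework such as Flink or Spark Streaming; the record fields, the notion of a "valid" event, and the per-user aggregation are illustrative assumptions, not part of this posting:

```python
# Toy clean -> transform -> aggregate pipeline over a batch of records.
# In production this logic would run inside a streaming framework
# (e.g., Flink or Spark Streaming); the field names here are assumptions.
from collections import defaultdict

def clean(events):
    # Drop malformed records: missing user or non-numeric amount.
    for e in events:
        if "user" in e and isinstance(e.get("amount"), (int, float)):
            yield e

def transform(events):
    # Normalize user ids and round amounts to two decimal places.
    for e in events:
        yield {"user": e["user"].strip().lower(), "amount": round(e["amount"], 2)}

def aggregate(events):
    # Sum amounts per user, as a downstream analytics store might.
    totals = defaultdict(float)
    for e in events:
        totals[e["user"]] += e["amount"]
    return dict(totals)

raw = [
    {"user": "Alice ", "amount": 10.5},
    {"user": "bob", "amount": 3},
    {"amount": 7},                      # dropped by clean(): no user
    {"user": "alice", "amount": 4.5},
]
print(aggregate(transform(clean(raw))))  # {'alice': 15.0, 'bob': 3.0}
```

The three generator stages mirror the pipeline shape the role describes: each stage consumes the previous one lazily, so the same structure extends naturally to unbounded streams.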
Minimum Requirements: Bachelor's degree or equivalent in Computer Science, Management Information Systems, Data Governance, Data Management, or a closely related field, with 5 years of data or software engineering experience, including hands-on production experience with distributed stream processing frameworks (Kafka, Spark Streaming, Storm), with building robust, fault-tolerant data pipelines that clean, transform, and aggregate unorganized data into databases or data stores, and with distributed data systems; experience with microservices, deployment platforms (Kubernetes), relational (e.g., SQL Server, Oracle) and non-relational (Cassandra, MongoDB) databases, data warehousing principles, schema design, data governance, database security, Agile or other rapid application development methods, and object-oriented design, coding, and testing patterns; experience engineering (commercial or open-source) software platforms and large-scale data infrastructures, including performance tuning, optimization, and bottleneck analysis; and knowledge of data modeling, of data structures and their benefits and limitations under particular use cases, and of programming or scripting languages.