Hadoop/Big Data Engineer in Houston, TX at DISYS

Date Posted: 5/16/2018

Job Snapshot

Job Description

The Data & Analytics Specialist (Big Data) is a core member of the team responsible for providing solutions to digital-age data challenges through the design and implementation of an ecosystem of Hadoop, NoSQL, AI, Machine Learning, and other modern data technologies, on premises or in the Cloud. The solutions include re-engineering data acquisition, storage, processing, security, data management, governance, and analysis using these technologies, leading to the implementation of a modern data platform. Solid experience with, and understanding of, the planning, scaling, deployment, and operations considerations unique to Hadoop, NoSQL, and other emerging modern data technologies is required.

The primary result is sustained improvement in reporting and analytics productivity, decision support, and business performance.


This role will work closely with data scientists, platform and application specialists, and data source owners to deliver the foundational data sets that enable analytics solutions driving successful business outcomes. This role will also provide leadership in defining and implementing best practices for efficiently sourcing and processing large data sets from Big Data Analytics data stores.

Contribute to and evangelize the execution of the Data and Analytics Vision and Strategy, in alignment with leadership priorities, business stakeholder requirements, and business unit requirements.

Work with Business and IT stakeholders to understand requirements, data sets, key success criteria, and issues that can be solved using analytics and/or data science techniques, and provide accurate estimates.

Build real-time, reliable, scalable, high-performing, distributed, fault-tolerant systems. Develop rapid prototypes to facilitate advanced analytics modeling, including Machine Learning and AI processing.

Design and develop code, scripts, and data pipelines that leverage structured and unstructured data. Implement measures to address data privacy, security, and compliance, and ensure robust data governance.

Monitor, maintain, and optimize production systems. Investigate and resolve incidents reported by users. Identify opportunities to automate, consolidate, and simplify the platform.

Work with the Enterprise Architect and other IT teams to develop and maintain data integrity, integration, and governance standards.

Contribute to the selection of platforms, data management, libraries, tool chains, and OSS for software development. Stay on top of evolving technology to suggest, prototype, and implement improvements to the data architecture.

Collaborate with cross-functional teams to help utilize and drive adoption of new big data tools and models. Mentor business stakeholders and members of the data and analytics teams regarding technology and best practices.

Manage assigned activities within time, cost and technical objectives. May manage small projects with internal or external resources.

Job Requirements

Work Experience:

5+ years of hands-on, full-life-cycle, complex/large-scale data warehouse and ETL design and development experience, including more than 1 year architecting, implementing, and successfully operationalizing large-scale data solutions in production environments using the Hadoop and NoSQL ecosystem, with many of the relevant technologies such as Azure Data Lake, Hive, Spark, NiFi, Kafka, HBase, Cassandra, etc.

Experience with a global enterprise environment or a major consulting firm is a plus.

Required Skills:

•         Strong foundation in, and a clear understanding and appreciation of, Data and Analytics design and development techniques to build solutions across multiple tools and platforms.

•         Hands-on experience architecting and industrializing data lakes or real-time platforms that enable enterprise business applications and usage at scale

•         Proven hands-on experience designing, building, configuring, and supporting the Hadoop Ecosystem in a production environment

•         Data stores – experience architecting data and building performant data models at scale for the Hadoop/NoSQL ecosystem of data stores, supporting different business consumption patterns off a centralized data platform

•         Data acquisition technologies, including batch and streaming analytics architectures implemented in Hadoop (such as Spark Streaming, Storm, NiFi, Kafka, etc.), and ETL from RDBMS and ERP systems to Hive with data validation to ensure data quality.

•         Spark/MapReduce/ETL processing using .NET, Java, Python, Scala, or Talend for data analysis of production Big Data applications

•         Cloud computing – such as Microsoft Azure

•         Familiarity with a broad base of analytical methods (one or more of the following):

•         Data integration and modeling (variable transformation & summarization, algorithm development)

•         Data processing (Spark, SQL Server, PostgreSQL, Hadoop/Hive)

•         Experience writing queries to move data from HDFS into Hive and to analyze data.

•         Ability to fine-tune Hadoop applications (Hive and Spark) for high performance and throughput.
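
To make the HDFS-to-Hive skill above concrete, a minimal HiveQL sketch of that pattern follows; all table names, paths, and schemas here are hypothetical, chosen only for illustration:

```sql
-- Expose raw files already landed in HDFS as an external Hive table
-- (path and schema below are illustrative assumptions).
CREATE EXTERNAL TABLE IF NOT EXISTS raw_sales (
  order_id  BIGINT,
  amount    DECIMAL(12,2),
  order_ts  TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/landing/sales';

-- Data validation before promotion: count rows that fail quality checks.
SELECT COUNT(*) AS bad_rows
FROM raw_sales
WHERE order_id IS NULL OR amount < 0;

-- Promote validated rows into a managed, partitioned ORC table for analytics.
CREATE TABLE IF NOT EXISTS sales (
  order_id BIGINT,
  amount   DECIMAL(12,2),
  order_ts TIMESTAMP
)
PARTITIONED BY (order_date DATE)
STORED AS ORC;

SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE sales PARTITION (order_date)
SELECT order_id, amount, order_ts, CAST(order_ts AS DATE) AS order_date
FROM raw_sales
WHERE order_id IS NOT NULL AND amount >= 0;
```

The external table leaves the landed files in place while making them queryable; the managed ORC table, partitioned by date, is the kind of performant consumption-layer model the data-store bullet above refers to.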

Preferred Skills:

•         Experience building data management (metadata, lineage, tracking, etc.) and governance solutions for modern data platforms that use Hadoop and NoSQL

•         Experience securing Hadoop/NoSQL based modern data platforms

•         Experience working with models in SAP HANA, manufacturing historians, and various IoT, subscription, and public data sources.

•         Experience with ETL tools such as SAP BusinessObjects Data Services.

•         Experience with Reporting and Analytics tools such as Tableau, Power BI, R, and Python on Big Data.

•         Understanding of statistics and mathematical techniques to solve real business problems.

•         Functional experience in one or more business functions.

Digital Intelligence Systems, LLC. is an Equal Opportunity Employer, M/F/D/V. We do not discriminate against any employee or applicant because they inquired about, discussed, or disclosed compensation. Email recruitinghelp @ disys.com to contact us if you are an individual with a disability and require accommodation in the application process.