Altiscale

Hadoop Engineer

Engineering | Palo Alto, CA, United States

We’re an Infrastructure-as-a-Service company that brings Hadoop innovation to the cloud. We’re going to change the way developers and data scientists interact with Hadoop, in a way that will change the game in Big Data. You’ll be part of a smart, passionate, like-minded team, and meet people who could shape the rest of your career. That doesn’t happen very often, even here in Silicon Valley.

Altiscale is looking for Hadoop Engineers who want to break new ground in the Hadoop world. We are looking for experienced, motivated software engineers to help us build the highly scalable, robust next-generation Hadoop stack needed for our multi-tenant cluster. The right engineer will work on one or more components of the stack, including Core (Common, HDFS, YARN, MapReduce), Pig, HCatalog, Oozie, Hive, HBase, and overall Hadoop performance.

If you are energized and excited about working on Open Source, difficult distributed systems problems, and creating the next generation of Hadoop infrastructure, come join us!
 
Responsibilities
  • Architect, engineer, test, document, and deploy the core systems of the Altiscale Hadoop as a Service offering
  • Improve and extend core components of the Hadoop ecosystem (HDFS, YARN, Hive, Pig, Oozie, etc.) to work at scale on our multi-tenant cluster
  • Work with Product Management to understand, design, and implement core features and customer-driven enhancements
  • Engage with customers to help them maximize the use of our services as part of a small and growing agile team
Skills & Requirements

We’re looking for people with some or all of these attributes:
  • Obsessive about great engineering and Open Source
  • Expert level proficiency in one or more of the following languages: Java, Ruby, C/C++, Python
  • Deep understanding of and experience with Hadoop internals (YARN/MapReduce, HDFS), Pig, HCatalog, Oozie, Hive, and HBase; being a Hadoop committer is a big plus
  • In-depth understanding of two or more of: system schedulers, data storage and management, virtualization, workload management, availability, scalability and distributed data platforms, security infrastructure
  • Deep understanding and experience with Linux internals, virtual machines, and open source tools/platforms
  • Experience building large-scale distributed applications and services
  • Sound knowledge of SQL & NoSQL databases
  • Experience developing and extending modern build-chain tools, such as Maven, Jenkins, Git
  • Strong grasp of algorithm and data structure fundamentals and demonstrated analytical and problem-solving skills, particularly those that apply to a “Big Data” environment
  • Experience with agile development methodologies
  • Knowledge of industry standards and trends
  • Excellent written/oral communication skills
  • 3-5 years of software development experience
  • Advanced degree in computer science or engineering (or equivalent experience)