Job Description
The Data Analyst is a developing professional global role. Deals with most problems independently and has some latitude to solve complex problems. Integrates in-depth specialty area knowledge with a solid understanding of industry standards and practices. Good understanding of how the team and area integrate with others in accomplishing the objectives of the sub-function/job family. Applies analytical thinking and knowledge of data analysis tools and methodologies. Requires attention to detail when making judgments and recommendations based on the analysis of factual information. Typically deals with variable issues with potentially broader business impact. Applies professional judgment when interpreting data and results. Breaks down information in a systematic and communicable manner. Developed communication and diplomacy skills are required in order to exchange potentially complex/sensitive information. Moderate but direct impact through close contact with the businesses' core activities. Quality and timeliness of service provided will affect the effectiveness of own team and other closely related teams.
Responsibilities:
- Design and develop applications on big data platforms (Hadoop, Spark, Kafka, etc.) with emphasis on performance, reliability, and scalability
- Work with Business Analysts, Product Managers, and Data Scientists to build the processes and infrastructure that support data-driven reports
- Build, automate, and maintain optimized and highly available data pipelines
- Monitor and control all phases of the development process (analysis, design, construction, testing, and implementation), and provide application support to users
- Gather operational data from various cross-functional stakeholders to examine past business performance.
- Identify data patterns and trends, and provide insights to enhance business decision-making in business planning, process improvement, solution assessment, etc.
- Recommend actions for future developments and strategic business opportunities, as well as enhancements to operational policies.
- May be involved in exploratory data analysis, confirmatory data analysis and/or qualitative analysis.
- Continuously improve processes and strategies by exploring and evaluating new data sources, tools, and capabilities
- Work closely with internal and external business partners in building, implementing, tracking and improving decision strategies
- Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.
Qualifications:
- 5 years of strong experience in big data / modern data processing technologies such as Apache Spark, HDFS, Hive, and HBase
- 5 years of hands-on experience with Python, PySpark, and Spark SQL
- Strong experience in data collection, processing, cleaning, and exploratory data analysis
- Deep knowledge of and strong skills in SQL and relational databases
- Experience with all aspects of DevOps (source control, continuous integration, deployments, etc.)
- Experience working with real-time data pipelines (Kafka, Apache Beam)
- Strong development/automation skills
- Knowledge of Unix shell scripting
- Hands-on experience with Apache Airflow
- Advanced process management skills; organized and detail-oriented
- Curious about learning and developing new skillsets
- Positive outlook with a can-do mindset
Education:
- Bachelor's/University degree or equivalent experience
Job ID: 29852