intro

ben.wangz

Wang Zhi

  • email: ben.wangz@foxmail.com
  • blog: https://blog.geekcity.tech

Summary

  • experienced Java developer with 8+ years of expertise in designing and implementing scalable data processing platforms.
  • proficient in Apache Flink, Kubernetes, and handling large volumes of data.
  • proven ability to lead teams and deliver high-performance systems.
  • built data processing platforms at several organizations, including Alibaba Group, ZhejiangLab and tianrang-inc.

Technical Skills

  • Java is my primary language for coding.
  • familiar with Kubernetes, especially storage, CI/CD workflows and Flink on k8s; see my blog for details.
  • familiar with stream processing, especially Apache Flink.

Experience

astronomy center of ZhejiangLab, Senior Engineer(2023 - now)

  • leading a team of 5 people
  • designing and developing a data platform
    • to support CSST(China Space Station Telescope)
    • to support Cosmic Antenna
    • related techniques: see data-lake in my blog for details

big data center of ZhejiangLab, Senior Engineer(2021 - 2023)

  • leading a team of 5-10 people
  • maintaining the old data processing platform and designing a new one with a pluggable architecture
    • data formats
      • old platform: uploaded csv files and database tables
      • new platform: driven by the calcite framework, which supports csv from s3, tables from common relational databases, graphs from neo4j/jena/rdf(TURTLE), etc.
    • data management
      • old platform: data is managed by postgresql/greenplum
      • new platform: built-in s3, remote s3 and relational databases
    • algorithms
      • old platform: only built-in algorithms
      • new platform: pluggable and custom algorithms, which can be written in java, python, etc.
    • planned (not yet finished) features for the new platform
      • execution engines supporting flink jobs that are compatible with the current calcite framework
      • pluggable applications
    • open source code
      • old platform: https://gitee.com/zhijiangtianshu/nebula
      • new platform: https://github.com/lab-zj/data-hub
  • maintaining the old k8s cluster and designing a new, production-ready one
    • all software managed by helm charts
    • maintaining three clusters, which together contain more than 300 services, takes less than one full-time person

天壤智能 (tianrang-inc), Professional Data Processing Engineer(2019 - 2021)

  • leading a team of 3-5 people
  • designing and implementing pandora, a flink-based data processing platform for analyzing advertisement placement data
    • refactoring business logic into small pieces with the table and sql apis of flink for better maintainability
    • reducing costs with flink on k8s
    • handling both batch and stream data with oss and kafka as the storage
  • refactoring the pipelines and code of the feature extraction and prediction algorithms for a recommendation system at China Merchants Bank
    • optimizing spark jobs
    • designing interfaces for algorithms
  • main reasons for leaving: my lovely daughter was on the way, the business was abandoned and my boss left the company

search dump team at Search Department Alibaba Group, Senior Java Developer(2015 - 2019)

  • maintaining and developing data processing logic and search engine dump systems for multiple business lines
  • designing and implementing an image data processing platform for image search in Taobao; pailitao is the main user of this platform
    • establishing standards for c++ image processing modules, based on protobuf, JNI, cmake, etc.
    • turning standalone image processing modules, provided by algorithm engineers, into a distributed system
    • handling modules like image feature extraction, image similarity calculation, clustering, etc.
    • no online faults for more than 3 years after the platform went into use
  • maintaining a data processing platform for Taobao/Tmall main search
    • no online fault more severe than P3
    • only two people, including me, handled all taobao/tmall search data for nearly 2 years
  • maintaining data processing platforms for 1688, aliexpress, etc.
    • no online faults
  • kpi: 3.75 in every review except one 3.5+
  • main reason for leaving: to dive deeper into stream processing and k8s

search dump team at Search Department Alibaba Group, intern(2014.07 - 2015.04)

  • maintaining the data processing platform for the search engine
  • independently developing a distributed MySQL-to-HBase data synchronization tool, which was secure, high-performance and widely used for more than 4 years

papers published

  • Song, Jie, HongYan He, Zhi Wang, Ge Yu, and Jean-Marc Pierson. "Modulo based data placement algorithm for energy consumption optimization of MapReduce system." Journal of Grid Computing 16 (2018): 409-424.
  • 宋杰, 王智, 李甜甜, and 于戈. "一种优化 MapReduce 系统能耗的数据布局算法." 软件学报 26, no. 8 (2015): 2091-2110.
  • Song, Jie, Tiantian Li, Zhi Wang, and Zhiliang Zhu. "Study on energy-consumption regularities of cloud computing systems by a novel evaluation model." Computing 95 (2013): 269-287.
  • Song, Jie, Chaopeng Guo, Zhi Wang, Yichan Zhang, Ge Yu, and Jean-Marc Pierson. "HaoLap: A Hadoop based OLAP system for big data." Journal of Systems and Software 102 (2015): 167-181.
  • 郭朝鹏, 王智, 韩峰, 张一川, and 宋杰. "HaoLap: 基于 Hadoop 的海量数据 OLAP 系统." 计算机研究与发展 S1 (2013): 378-383.
  • 宋杰, 侯泓颖, 王智, and 朱志良. "云计算环境下改进的能效度量模型." 浙江大学学报: 工学版 1 (2013): 44-52.

Education

  • Northeastern University (2009 - 2013), Bachelor's degree in Software Engineering
  • Northeastern University (2013 - 2015), Master's degree in Software Engineering

Certifications

additional

  • Adept at solving problems in large-scale/distributed data processing systems
  • Capable of designing and building search dump data processing platforms, CI/CD automation workflows, etc.
  • Able to lead/mentor a small product research and development team
  • Can also develop microservice architectures using Spring Boot, but not an expert
  • Can also use simple algorithm models, but not proficient