sciwork 2023

Mars Su

LinkedIn

https://www.linkedin.com/in/huei-yuan-su-a16458134/

Work Experience
  • Trend Micro - Staff Data Engineer (Present)
  • Gogolook - Sr. Data/ML Engineer
  • Wavenet - Data Scientist
  • adGeek - Advertising AI Engineer
Important Experience
  • PyCon APAC 2022 Speaker
  • Published book - Apache NiFi 讓你輕鬆設計 Data Pipeline (Apache NiFi: Design Data Pipelines with Ease)
  • iThome 2021 AI & Data - Champion
Education
  • National Taiwan University of Science and Technology - Master's Degree, IM
  • National Changhua University of Education - Bachelor's Degree, IM

Sessions

12-09
14:20
30min
Data Lakehouse Architecture Evolution and Future
Mars Su

In today's data-driven world, we constantly face the challenges of storing large volumes of data and of supporting analytics and machine-learning applications on top of it. In the past, we used databases, data lakes, or data warehouses to store different kinds of data, including structured, semi-structured, and unstructured data. Although many storage systems and tools address the corresponding problems and scenarios, they still have limitations and shortcomings.

To address these, one concept has been increasingly discussed in recent years: the lakehouse, which combines the advantages of the data lake and the data warehouse into a powerful architecture for implementing the modern data stack. Several mature services and tools implement this concept, including Databricks' Delta Lake, Apache Iceberg, and Apache Hudi.

In this session, I will briefly describe and analyze the concepts, benefits, and drawbacks of databases, data lakes, data warehouses, and lakehouses, and introduce some representative services. Finally, I will show a few lakehouse demos so that attendees can understand the architecture more concretely.

Track
NYCU