
*PLEASE NOTE: ALL SESSION TIMES ARE LISTED IN UTC by default*



Tuesday, September 22 • 11:45am - 12:10pm
LVC20-101 Make BigData fly on Arm64 - Apache Arrow


Slack channel to chat with speaker: https://linaroconnect.slack.com/archives/C01AKEK4AVD

Description:
The BigData world has many data formats, such as Parquet files read with Python (pandas), Spark DataFrames, JSON, Avro, and CSV.

About 70-80% of computation can be wasted on data conversion and serialization/deserialization between different projects.

Apache Arrow addresses these issues and facilitates communication between many components with its high-speed in-memory representation for flat and hierarchical data. It can deliver a 10-100x speedup on in-memory analytics workloads.

In collaboration with Linaro LDCG, we validated Apache Arrow on Arm64 and delivered Arm-specific optimizations for Arrow.
This session will cover an overview of Apache Arrow, a brief introduction to Arrow optimization with the Arm crypto and Neon extensions, and the status of patches submitted to the community. You will see benchmark results and learn how to take advantage of ARMv8 features to make your data fly.

Speakers
YUQI GU

Senior software engineer, Arm
Yuqi Gu currently works at Arm and serves as a committer for the Apache Bigtop project. He is also an active contributor to Apache Arrow, MariaDB, and RocksDB, focusing mainly on performance optimization on Arm64.


Tuesday September 22, 2020 11:45am - 12:10pm UTC
[Track 3] DataCenter