This article explains how to migrate existing Pandas workflows to the Snowpark Pandas API, so that data processing can scale up without a full code rewrite. It is largely a lift-and-shift approach: the aim is to get existing data processing workflows up and running in minimal time and in a highly secure environment.
Prerequisites
- Proficiency in Python scripting (version 3.8 or later)
- Working knowledge of both basic and complex SQL
- Snowflake Account
- Snowflake Warehouse Usage permissions
- An external stage on AWS S3 (or another cloud provider) and an access integration
Introduction
Pandas has long been the go-to library for data manipulation and analysis. As datasets grow in volume and variety, however, traditional Pandas can run into memory limitations and performance bottlenecks, because it processes data in the memory of a single machine. The Snowpark Pandas API is a promising tool that brings the power of distributed computing to the Pandas API, within the secure environment of Snowflake.