Store Procedure
A stored procedure is a precompiled set of SQL statements stored and executed within a database. It performs specific operations such as retrieving, inserting, updating, or deleting data. Essentially, it is reusable SQL code that can be called multiple times, helping to maintain consistency and reduce redundancy.1
User-Defined Function (UDF)
A user-defined function is a custom function written by a user to perform specific calculations, transformations, or logic that are not supported by built-in functions. UDFs always return a value and can be used directly in SQL queries.2
Tasks
A Snowflake Task is an object used to schedule and automate the execution of SQL statements or stored procedures within Snowflake. It enables recurring operations and facilitates automated data pipelines.3
- Tasks can be scheduled to run at specific times.
- Tasks can also be triggered by events, such as new data arriving in a stream.
Streams
A stream in Snowflake is used for Change Data Capture (CDC). It tracks DML changes (inserts, updates, deletes) made to a table and maintains metadata about these changes.4
- A stream creates a “change table” that logs what changed in the source table, at the row level, between two transactional points.
- This allows users to query and consume only changed data in a transactional and incremental manner.
Key Differences Between UDFs, Tasks, Streams, Stored Procedures
| Feature | Stored Procedure | User-Defined Function | Tasks | Stream |
| Parameters | Accepts parameters | Accepts parameters | Does not accept parameters | The only parameter is the source table |
| Execution | Executed manually | Executed manually within SQL | Scheduled or triggered | Automatic based on changes for the source table |
| Return Values | May return values (implicitly NULL if not) | Must return a value | Does not return values | Data reflected in the stream table |
| SQL Integration | Return values cannot be used directly in SQL statements | Return values are directly usable in SQL | ||
| SQL Usage | Can be used directly in FROM or Need command CALL | Can be used directly in SELECT | Cannot be used directly in SELECT | Creates a table called stream table |
Example: How to use all
Let’s create a stream
-- 1. Create the base table
CREATE OR REPLACE TABLE orders (
id INT,
order_name VARCHAR
);
-- 2. Table to hold processed results
CREATE OR REPLACE TABLE orders_with_tip (
id INT,
order_name VARCHAR
);
-- 3. Create a stream on the 'orders' table
-- Note: Append-only to track only inserts
CREATE OR REPLACE STREAM orders_stream
ON TABLE orders
APPEND_ONLY = TRUE;
-- 4. Create a user-defined function (UDF)
-- This adds a prefix "order_" to any integer ID
CREATE OR REPLACE FUNCTION order_to_string(i INT)
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.9'
HANDLER = 'order_to_string_py'
AS $$
def order_to_string_py(i):
return 'order_' + str(i)
$$;
-- 5. Create a stored procedure to clean rows from 'orders' table
-- where order_name does NOT start with 'order_'
CREATE OR REPLACE PROCEDURE cleaning_old_orders(target_table STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.9'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'clean'
AS
$$
def clean(session, target_table):
# Remove rows that do not start with 'order_'
df = session.table(target_table)
df_filtered = df.filter(~df["order_name"].startswith("order_"))
df_filtered.delete()
return "SUCCESS"
$$;
-- 6. Create a task that runs every minute if the stream has data
-- It inserts new rows into 'orders_with_tip' and calls the cleaning procedure
CREATE OR REPLACE TASK process_orders_task
WAREHOUSE = mywh
SCHEDULE = '1 MINUTE'
WHEN system$stream_has_data('orders_stream')
AS
BEGIN
-- Insert new rows into target table using UDF
INSERT INTO orders_with_tip (id, order_name)
SELECT id, order_to_string(id)
FROM orders_stream;
-- Call stored procedure to clean up 'orders' table
CALL cleaning_old_orders('orders');
END;
-- 7. Activate the task
ALTER TASK process_orders_task RESUME;