AsyncFileSystem and SyncFileSystem for asynchronous and synchronous file operations, respectively. The app shows how Ray Remote Execution can be used for distributed file processing across multiple nodes.
Overview
The Upload App is an example of:
- Forms with File Upload: Using InputFiles for multiple file uploads
- Ray Remote Execution: Parallel processing of files across distributed nodes
- AsyncFileSystem vs SyncFileSystem: Demonstration of the differences between synchronous and asynchronous file system operations
- File Download/Upload across Nodes: Ray Remote Functions enable file operations across different nodes
Step-by-Step Implementation
1. Imports and Setup
- ray.remote: Decorator for Remote Functions
- ray.serve: For service deployment
- kodosumi.core: Core components for Kodosumi apps
- pptx.Presentation: For PowerPoint file processing (example only)
2. Ray Remote Function for File Processing
- @remote: Makes the function a Ray Remote Function
- tracer.fs_sync(): Uses SyncFileSystem since Ray Remote Functions are always synchronous
- fs.download(file): Downloads file from Kodosumi File System
- fs.upload(str(md_file)): Uploads processed file
- Error Handling: Robust handling of processing errors
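The download, process, upload sequence with error handling can be sketched without Ray or Kodosumi installed. In the sketch below, InMemorySyncFS is a hypothetical stand-in (it is not Kodosumi's SyncFileSystem, and its method signatures are assumptions), and the conversion step is a placeholder for the PowerPoint processing. Only the control flow is meant to carry over.

```python
from pathlib import PurePosixPath


class InMemorySyncFS:
    """Hypothetical stand-in for a synchronous file system (not Kodosumi's API)."""

    def __init__(self, files):
        self.files = dict(files)  # maps path -> text content

    def download(self, path):
        # Raises KeyError if the file does not exist.
        return self.files[path]

    def upload(self, path, content):
        self.files[path] = content
        return path


def process_file(fs, path, ignore_errors=False):
    """Download one file, convert it, and upload the result under out/."""
    try:
        text = fs.download(path)
        stem = PurePosixPath(path).stem
        # Placeholder conversion; the real app extracts text from PowerPoint.
        converted = "# " + stem + "\n\n" + text.strip() + "\n"
        target = fs.upload("out/" + stem + ".md", converted)
        return {"file": path, "result": target, "error": None}
    except Exception as exc:
        if not ignore_errors:
            raise
        # Report the failure instead of aborting the whole batch.
        return {"file": path, "result": None, "error": repr(exc)}
```

Returning a small result record per file, rather than raising, lets the caller decide how to report failures when the "ignore errors" option is set.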
3. Main Function with AsyncFileSystem
- await tracer.fs(): Uses AsyncFileSystem for asynchronous operations
- await afs.ls("in"): Asynchronous file listing
- process_file.remote(): Starts Remote Functions
- asyncio.gather(*futures): Waits for all Remote Functions
- await afs.close(): Properly closes AsyncFileSystem
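The list, fan-out, gather, close pattern described above can be tried locally with the standard library alone. This is a minimal sketch under stated assumptions: InMemoryAsyncFS is a hypothetical class that only mirrors the ls()/close() shape, and asyncio.to_thread stands in for process_file.remote(), so nothing here runs on a Ray cluster.

```python
import asyncio


class InMemoryAsyncFS:
    """Hypothetical async stand-in; only mirrors the ls()/close() shape."""

    def __init__(self, names):
        self.names = list(names)
        self.closed = False

    async def ls(self, folder):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return [n for n in self.names if n.startswith(folder + "/")]

    async def close(self):
        self.closed = True


def process_file(name):
    """Synchronous worker, standing in for the Ray remote function."""
    return name.upper()


async def run(afs):
    try:
        files = await afs.ls("in")
        # asyncio.to_thread replaces process_file.remote() in this local sketch.
        futures = [asyncio.to_thread(process_file, f) for f in files]
        # gather returns results in the same order as the submitted futures.
        return await asyncio.gather(*futures)
    finally:
        # Close the file system even if listing or processing fails.
        await afs.close()
```

Placing the close call in a finally block is a design choice of this sketch; it keeps the file system from staying open when one of the workers raises.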
4. Form Definition
- F.InputFiles: Multiple file upload
- F.Checkbox: Option to ignore errors
- F.Submit/F.Cancel: Action buttons
5. Endpoint Registration
6. Ray Serve Deployment
AsyncFileSystem vs SyncFileSystem
AsyncFileSystem
- Usage: In asynchronous functions (async def)
- Methods: await fs.ls(), await fs.upload(), await fs.close()
- Advantages: Non-blocking, better performance for I/O operations
- Example: Main function run()
SyncFileSystem
- Usage: In synchronous functions (Ray Remote Functions)
- Methods: fs.ls(), fs.upload(), fs.close()
- Advantages: Easier to use, compatible with Ray Remote Functions
- Example: process_file() Remote Function
Ray Remote Execution
Why Ray Remote Functions?
- Distributed Processing: Files can be processed on different nodes
- Scalability: Automatic load balancing
- Fault Tolerance: Error handling at node level
- Resource Management: Efficient use of CPU and memory
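One way to keep a single failing file from aborting a whole batch is to collect exceptions as results. The sketch below is an illustration only and is not taken from the Upload App: process_file is a hypothetical worker, asyncio.to_thread again stands in for a Ray remote call, and the standard asyncio.gather option return_exceptions=True does the collecting.

```python
import asyncio


def process_file(name):
    """Hypothetical worker that fails for inputs ending in .bad."""
    if name.endswith(".bad"):
        raise ValueError("cannot parse " + name)
    return name + ".md"


async def run_batch(names):
    tasks = [asyncio.to_thread(process_file, n) for n in names]
    # With return_exceptions=True, raised exceptions are returned as results
    # instead of cancelling the wait for the remaining tasks.
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    ok = [o for o in outcomes if not isinstance(o, Exception)]
    failed = {
        n: str(o) for n, o in zip(names, outcomes) if isinstance(o, Exception)
    }
    return ok, failed
```

Calling asyncio.run(run_batch(["a", "b.bad", "c"])) yields the two successful outputs plus a mapping from the failed input to its error message.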
Remote Function Lifecycle
- File Upload → Kodosumi File System
- AsyncFileSystem.ls() → Get file list
- process_file.remote() → Start Remote Function
- SyncFileSystem.download() → Download file to remote node
- File Processing → PowerPoint to text
- SyncFileSystem.upload() → Upload result
- Result Return → To main function
Running the Upload Example
This section provides step-by-step instructions for running the Upload App example from the kodosumi-examples repository.
Prerequisites
- Python Environment: Python 3.12 or higher
- Ray Cluster: Running Ray cluster (local or distributed)
- Kodosumi: Installed and configured
- Dependencies: Required packages for the example
Setup Instructions
1. Clone the Examples Repository
2. Install Dependencies
3. Start Ray Cluster
4. Deploy the Upload Example
Create a deployment configuration file data/config/upload_example.yaml and a Ray Serve deployment file data/config/config.yaml: