Empowering Healthcare Professionals with Secure, Offline-Ready AI that Transforms Medical Documentation While Keeping Patient Data Protected
Authors: Wei-Lin Wen, Yu-Yao Tsai
PrivNurse AI is an end-to-end, on-premises artificial intelligence system designed to combat one of the most pressing issues in modern healthcare: clinician burnout driven by administrative overload. By harnessing the unparalleled on-device efficiency and multimodal capabilities of Google's Gemma 3n, PrivNurse AI empowers nurses and physicians by automating and accelerating the creation of complex clinical documentation.
The system features three core modules:
- Consultation Note Summarizer - Uses Chain-of-Thought reasoning to discern clinical priorities
- Discharge Note Summarizer - Generates structured discharge summaries
- Speech-to-Text Nursing Note Transcriber - Hands-free clinical documentation
Deployed entirely within a hospital's secure network, PrivNurse AI guarantees patient data privacy (HIPAA/GDPR compliance) while delivering clinically validated, explainable, and continuously improving AI assistance directly at the point of care.
Our one-month clinical deployment study at Kuang Tien General Hospital demonstrated exceptional results:
- User Satisfaction: 9.17/10 average rating
- Adoption Rate: >85% (exceeding industry benchmarks of 60-75%)
- Time Reduction: 91.7% reduction in documentation time (from 5 minutes to 25 seconds per consultation note)
- Study Participants: 39 nursing staff across 2 nursing stations
- Records Processed: 401 consultation records
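The time-reduction figure follows directly from the before and after durations quoted above. A quick check in Python, using only those two numbers (the helper name `percent_reduction` is illustrative):

```python
def percent_reduction(before_s: float, after_s: float) -> float:
    """Return the relative reduction from before_s to after_s, in percent."""
    return (before_s - after_s) / before_s * 100


# Figures from the deployment study: 5 minutes before, 25 seconds after.
reduction = percent_reduction(before_s=5 * 60, after_s=25)
print(f"{reduction:.1f}% less documentation time per note")
```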
Our comprehensive training pipeline transforms the general-purpose Gemma 3n into a highly specialized clinical expert through four critical stages:
Stage 1: Data Preprocessing
- Data Cleaning: Removal of incomplete records and inconsistencies
- Standardization: Uniform formatting across different record types
- Data Integration: Consolidation of multi-source medical records
- HIPAA Compliance: Complete de-identification following Safe Harbor guidelines
Stage 2: Training Data Distillation
- Medical Structured Chain-of-Thought (MedSCoT): Claude-Sonnet-4 generates reasoning chains for consultation prioritization
- Medical Data Distillation: MedGemma-27B-IT creates structured discharge summaries with 5 essential elements
- Clinical Reasoning Integration: Teaching models how to think, not just what to write
Stage 3: Fine-Tuning
- QLoRA Technique: 4-bit quantization with Low-Rank Adaptation
- Unsloth Optimization: 1.5x faster training with 50% less VRAM usage
- Four Specialized Models: Dedicated agents for summarization and validation tasks
Stage 4: Deployment Preparation
- Model Merging: Integration of LoRA adapters with base Gemma-3n-E4B
- GGUF Conversion: Optimization for Ollama framework compatibility
- Q8_0 Quantization: 37.5% VRAM reduction (16GB → 10GB) without accuracy loss
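To illustrate why Low-Rank Adaptation keeps fine-tuning cheap, the sketch below counts trainable parameters for a single weight matrix. A rank-r adapter trains two small factors with r × (d_in + d_out) parameters in total, while the dense d_out × d_in matrix stays frozen. The layer size and rank are hypothetical values chosen for illustration, not settings taken from the PrivNurse AI training notebooks.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical layer size and rank, chosen only for illustration.
d_in, d_out, rank = 2048, 2048, 16
adapter = lora_trainable_params(d_in, d_out, rank)
dense = full_params(d_in, d_out)
print(f"adapter: {adapter:,} params ({adapter / dense:.2%} of the dense layer)")
```

The same ratio shrinks further as the layer grows, which is why adapters remain small even on large base models.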
Our innovative architecture deploys four specialized models through the Ollama framework:
Task A & B: Document Summarization
- Agent 1 (Summarizer): Generates clinical summaries using MedSCoT reasoning
- Agent 2 (Highlighter): Identifies source evidence and provides explainability
- JSON Match Processor: Creates bidirectional traceability between summaries and source text
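A minimal sketch of how bidirectional traceability could work, assuming the Highlighter returns a JSON list of objects with a summary sentence and a verbatim evidence quote. The field names `summary_sentence` and `evidence` and the helper `build_trace` are hypothetical; the actual schema used by PrivNurse AI is not documented here.

```python
import json


def build_trace(source_text: str, highlights_json: str) -> dict:
    """Link each summary sentence to character spans of its evidence in the source.

    highlights_json is assumed (hypothetically) to be a JSON list of
    {"summary_sentence": str, "evidence": str} objects, where "evidence"
    is quoted verbatim from the source text.
    """
    summary_to_source = {}
    source_to_summary = {}
    for item in json.loads(highlights_json):
        sentence, quote = item["summary_sentence"], item["evidence"]
        start = source_text.find(quote)
        if start == -1:
            # Evidence not found verbatim: record nothing rather than guess.
            continue
        span = (start, start + len(quote))
        summary_to_source.setdefault(sentence, []).append(span)
        source_to_summary.setdefault(span, []).append(sentence)
    return {"summary_to_source": summary_to_source,
            "source_to_summary": source_to_summary}


source = "Patient reports chest pain for two days. No fever. Allergic to penicillin."
highlights = json.dumps([
    {"summary_sentence": "Two-day history of chest pain.",
     "evidence": "chest pain for two days"},
    {"summary_sentence": "Penicillin allergy.",
     "evidence": "Allergic to penicillin"},
])
trace = build_trace(source, highlights)
for sentence, spans in trace["summary_to_source"].items():
    print(sentence, "->", [source[a:b] for a, b in spans])
```

Storing character spans rather than copied text lets a user interface highlight the evidence in place, and the reverse map answers which summary sentences depend on a given passage.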
Task C: Multimodal Speech Processing
- Gemma-3n Audio Processing: FastAPI-based microservice for real-time transcription
- Clinical Context Optimization: Specialized prompts for medical terminology
- Hands-free Documentation: Seamless integration into clinical workflows
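As a rough illustration of how a client might send a recording to the transcription microservice, the sketch below uploads a WAV file with only the Python standard library. The endpoint path and the `audio_file` form field are taken from the curl test command in this README; the helper names `build_multipart` and `transcribe`, and the assumption that the service replies with JSON, are illustrative. If bearer token authentication is enabled, an `Authorization` header would also be needed.

```python
import json
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, payload: bytes,
                    content_type: str = "audio/wav") -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, url: str = "http://localhost:8444/generate/audio-text"):
    """POST a WAV file to the audio API and return the decoded response.

    Assumption: the service replies with JSON; the response schema is not
    documented in this README.
    """
    with open(path, "rb") as f:
        body, header = build_multipart("audio_file", "note.wav", f.read())
    req = urllib.request.Request(url, data=body, method="POST",
                                 headers={"Content-Type": header})
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the audio API running, `transcribe("test.wav")` would return whatever the service sends back for that file.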
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 8+ cores | AMD Ryzen 9 / Intel i9 |
| RAM | 16GB | 32GB+ DDR4/DDR5 |
| GPU | 8GB VRAM | RTX 4060Ti/4090, A100, H100 |
| Storage | 50GB | 100GB+ NVMe SSD |
| Network | Stable connection | Gigabit Ethernet |
- OS: Ubuntu 20.04+
- Python: 3.8+
- Node.js: 18+
- MySQL: 5.7+
- Docker: 20.10+ (optional)
- Ollama: Latest version
- FFmpeg: For audio processing
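Before installing, it can help to confirm that the prerequisites are present. The following sketch checks the Python version and looks for each tool on `PATH`; it does not verify the versions of Node.js, MySQL, or the other tools. The executable names in `REQUIRED_TOOLS` are assumptions based on the list above and may differ on some systems.

```python
import shutil
import sys

# Executable names assumed from the prerequisites listed above; adjust as needed.
REQUIRED_TOOLS = ["node", "npm", "mysql", "ollama", "ffmpeg"]


def check_environment(min_python=(3, 8), tools=REQUIRED_TOOLS) -> dict:
    """Return {check name: bool} for the Python version and each tool on PATH."""
    results = {"python": sys.version_info >= min_python}
    for tool in tools:
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```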
- Google Gemma 3n - Base language model architecture
- Hugging Face Transformers - Model implementation and hosting platform
- Ollama - Local model deployment and inference framework
- FastAPI - Backend API development
- Next.js - Frontend web application framework
- React - User interface components
- MySQL - Database management system
- PyTorch - Deep learning framework
- Unsloth - Training optimization library
- QLoRA - Parameter-efficient fine-tuning
- llama.cpp - Model quantization and optimization
- FFmpeg - Audio processing
- Transformers - Model inference
- Material-UI & Chakra UI - Frontend component libraries
git clone https://github.com/weilin1205/PrivNurseAI.git
cd PrivNurseAI
# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3 python3-pip python3-venv nodejs npm mysql-server
sudo apt install -y ffmpeg libsndfile1 libasound2-dev portaudio19-dev
# Install NVIDIA drivers (for GPU acceleration)
sudo apt install -y nvidia-driver-535 nvidia-cuda-toolkit
# Install Ollama and Start Ollama service
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
cd backend
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with your MySQL credentials and configuration
cd ../frontend
# Install dependencies
npm install
# Configure environment
cp .env.local.example .env.local
# Edit .env.local with your API URL
# Start MySQL
sudo systemctl start mysql
# Create database
mysql -u root -p
CREATE DATABASE inference_db;
EXIT;
# Terminal 1: Start Backend
cd backend
python main.py
# Terminal 2: Start Frontend
cd frontend
npm run dev
# Terminal 3: Start Audio API (optional)
cd ExpertAgentC_LLMServer_Nursing_Note_STT
./start_api.sh
- Web Interface: http://localhost:3000
- API Documentation: http://localhost:8000/docs
- Audio API: http://localhost:8444 (if configured)
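Once the services are started, a short script can confirm that each one is accepting connections. This is a sketch that only tests whether the TCP ports listed above are open on the local machine; it does not validate the applications themselves. The `SERVICES` labels and the `is_listening` helper are illustrative, and a service bound only to IPv6 would need a different host address.

```python
import socket

# Ports taken from the access points listed above.
SERVICES = {"frontend": 3000, "backend": 8000, "audio-api": 8444}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening(port) else "not reachable"
        print(f"{name:10s} :{port}  {state}")
```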
Default Login:
- Username: admin
- Password: password
If you prefer to set up individual components, follow these guides:
cd Data_Preprocessing
pip install rich psutil
python PrivNurse_data_preprocessing.py
cd Training_Data_Distillation
# Generate consultation datasets with Claude
python privNurse_consult_validation_claude.py
python privNurse_consult_summary_claude.py
# Generate discharge datasets with MedGemma
jupyter notebook PrivNurse_note_validation_medgemma.ipynb
jupyter notebook PrivNurse_note_summary_medgemma.ipynb
cd FineTuning_Training
# Fine-tune all four specialized models
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_validation.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_validation.ipynb
cd ExpertAgentC_LLMServer_Nursing_Note_STT
chmod +x setup.sh
./setup.sh
cd gemma-audio-api
./start_api.sh
cd privnurse_gemma3n
# Follow backend and frontend setup as described above
- Access to de-identified medical records
- Institutional Review Board (IRB) approval
- Computational resources (GPU recommended)
- Hugging Face access token
cd Data_Preprocessing
python PrivNurse_data_preprocessing.py
cd Training_Data_Distillation
# For consultation tasks (requires Claude API access)
python privNurse_consult_validation_claude.py
python privNurse_consult_summary_claude.py
# For discharge tasks (requires MedGemma access)
jupyter notebook PrivNurse_note_validation_medgemma.ipynb
jupyter notebook PrivNurse_note_summary_medgemma.ipynb
cd FineTuning_Training
# Train all four specialized models
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_validation.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_validation.ipynb
# Import trained models
ollama create gemma-3n-privnurse-consult-summary-v1 -f Modelfile_PrivNurse_Consultation_Summary_v1
ollama create gemma-3n-privnurse-consult-validation-v1 -f Modelfile_PrivNurse_Consultation_Validation_v1
ollama create gemma-3n-privnurse-note-summary-v1 -f Modelfile_PrivNurse_DischargeNote_Summary_v1
ollama create gemma-3n-privnurse-note-validation-v1 -f Modelfile_PrivNurse_DischargeNote_Validation_v1
- On-device Processing: All AI inference occurs locally
- Data De-identification: Complete PII removal following Safe Harbor guidelines
- Encrypted Storage: All patient data encrypted at rest
- Audit Trails: Comprehensive logging of all system interactions
- Access Controls: Role-based permissions and authentication
- Firewall Configuration: Restricted port access
- API Authentication: Bearer token security
- Rate Limiting: Protection against abuse
- HTTPS Support: Encrypted communications
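The bearer token check and rate limiting listed above can be sketched with the standard library alone. This is a generic illustration under stated assumptions, not the project's actual middleware: `token_matches` and `TokenBucket` are hypothetical names, and the token bucket is one common rate-limiting technique among several.

```python
import hmac
import time


def token_matches(presented: str, expected: str) -> bool:
    """Compare bearer tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(presented.encode(), expected.encode())


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a web backend, one bucket per client (keyed by token or IP address) would typically be consulted before each request is processed, with a rejected request answered by HTTP 429.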
# Backend API tests
cd backend
python -m pytest tests/
# Frontend component tests
cd frontend
npm test
# Audio API tests
cd ExpertAgentC_LLMServer_Nursing_Note_STT
python test_api.py
# System resource monitoring
htop
nvidia-smi
# API response time testing
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8000/health
# Check MySQL status
sudo systemctl status mysql
# Reset MySQL password
sudo mysql_secure_installation
# Verify database exists
mysql -u root -p -e "SHOW DATABASES;"
# Check available models
ollama list
# Re-create models if corrupted
ollama create gemma-3n-privnurse-note-summary-v1 -f Modelfile_PrivNurse_DischargeNote_Summary_v1
# Verify Ollama service
curl http://localhost:11434/api/tags
# Check GPU usage
nvidia-smi
# Clear GPU memory
sudo fuser -v /dev/nvidia*
sudo kill -9 <PID>
# Verify FFmpeg installation
ffmpeg -version
# Check audio file permissions
ls -la /path/to/audio/files
# Test audio processing
curl -X POST -F "audio_file=@test.wav" http://localhost:8444/generate/audio-text
# Restrict inference to GPU 0 and cap the CUDA allocator's split size to limit memory fragmentation
export CUDA_VISIBLE_DEVICES=0
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
-- Optimize MySQL for better performance
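-- SET does not accept size suffixes such as 2G at runtime, so the value is given in bytes (2 GiB)
SET GLOBAL innodb_buffer_pool_size = 2147483648;
-- The query cache exists in MySQL 5.7 only; it was removed in MySQL 8.0, so skip this statement there
SET GLOBAL query_cache_size = 268435456;
CREATE INDEX idx_patient_id ON consultations(patient_id);
# Increase file descriptor limits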
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 65536" | sudo tee -a /etc/security/limits.conf
# Optimize kernel parameters
echo "net.core.somaxconn = 65535" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
- Data Preprocessing Guide - Clean and prepare medical records
- Training Data Generation - Create specialized training datasets
- Model Fine-tuning Guide - Train your clinical expert models
- Speech-to-Text Server - Deploy multimodal capabilities
- Main Application Guide - Run the complete system
- Technical Report - Comprehensive technical details
- Backend API: http://localhost:8000/docs (Swagger UI)
- Audio API: http://localhost:8444/docs (FastAPI docs)
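Besides the interactive pages, FastAPI applications serve a machine-readable schema at `/openapi.json` by default, which makes it easy to list the available endpoints from a script. The sketch below assumes the backend keeps that default URL; `list_endpoints` and `fetch_schema` are illustrative helper names.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "options", "head", "trace"}


def list_endpoints(schema: dict) -> list:
    """Return sorted 'METHOD /path' strings from an OpenAPI schema dictionary."""
    endpoints = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                endpoints.append(f"{method.upper()} {path}")
    return sorted(endpoints)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema; assumes FastAPI's default /openapi.json URL is unchanged."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

# With the backend running: print("\n".join(list_endpoints(fetch_schema())))
```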
We welcome contributions from the healthcare AI community!
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow existing code style and conventions
- Add tests for new functionality
- Update documentation for API changes
- Ensure HIPAA compliance for any healthcare-related features
- Additional clinical specialties (cardiology, oncology, etc.)
- Multi-language support
- Mobile application development
- Performance optimizations
- Additional test coverage
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- Gemma 3n: Gemma Terms of Use
- Ollama: Apache License 2.0
- FastAPI: MIT License
- Next.js: MIT License
- Kuang Tien General Hospital - Clinical deployment and validation
- IRB Committee - Ethical oversight and approval (IRB no.: KTGH 1135)
- FastAPI, Next.js, React, and all open-source contributors
- The healthcare AI research community
- Documentation: Complete Deployment Guide
- Bug Reports: GitHub Issues
- Discussions: GitHub Discussions
For research collaborations, clinical partnerships, or enterprise deployment inquiries:
- Email: Contact via GitHub Issues
- Technical Report: Full PDF
If you use PrivNurse AI in your research, please cite:
@misc{wen2025privnurse,
title={PrivNurse AI: Revolutionizing Clinical Documentation with On-Device Intelligence},
author={Wei-Lin Wen and Yu-Yao Tsai},
year={2025},
url={https://github.com/weilin1205/PrivNurseAI}
}







