PrivNurse AI: Revolutionizing Clinical Documentation with On-Device Intelligence

Empowering Healthcare Professionals with Secure, Offline-Ready AI that Transforms Medical Documentation While Keeping Patient Data Protected

Demo Video · Technical Report · License · Python · FastAPI · Next.js

Authors: Wei-Lin Wen, Yu-Yao Tsai


πŸš€ Executive Summary

Figure 1: PrivNurse AI System Architecture Overview

PrivNurse AI is an end-to-end, on-premises artificial intelligence system designed to combat one of the most pressing issues in modern healthcare: clinician burnout driven by administrative overload. By harnessing the unparalleled on-device efficiency and multimodal capabilities of Google's Gemma 3n, PrivNurse AI empowers nurses and physicians by automating and accelerating the creation of complex clinical documentation.

🎯 Core Features

The system features three core modules:

  1. 🩺 Consultation Note Summarizer - Uses Chain-of-Thought reasoning to discern clinical priorities
  2. πŸ“‹ Discharge Note Summarizer - Generates structured discharge summaries
  3. 🎀 Speech-to-Text Nursing Note Transcriber - Hands-free clinical documentation

Deployed entirely within a hospital's secure network, PrivNurse AI guarantees patient data privacy (HIPAA/GDPR compliance) while delivering clinically-validated, explainable, and continuously improving AI assistance, directly at the point of care.

πŸ“Š Clinical Results

Our one-month clinical deployment study at Kuang Tien General Hospital demonstrated exceptional results:

  • πŸ† User Satisfaction: 9.17/10 average rating
  • πŸ“ˆ Adoption Rate: >85% (exceeding industry benchmarks of 60-75%)
  • ⚑ Time Reduction: 91.7% reduction in documentation time (from 5 minutes to 25 seconds per consultation note)
  • πŸ‘₯ Study Participants: 39 nursing staff across 2 nursing stations
  • πŸ“ Records Processed: 401 consultation records

🎬 System Demonstration

Patient Management System

Comprehensive patient data management with intuitive interface

Consultation Note Summarization

AI-powered consultation note summarization with explainable highlighting

Discharge Note Summarization

Structured discharge summary generation from comprehensive medical records

Speech-to-Text Transcription

Hands-free nursing documentation via advanced speech recognition

AI Model Management

Centralized management of specialized clinical AI models

User Management

Role-based access control for healthcare professionals


πŸ—οΈ System Architecture

Training Pipeline: Forging a Clinical Expert

Figure 2: Advanced Training Pipeline Architecture

Our comprehensive training pipeline transforms the general-purpose Gemma 3n into a highly specialized clinical expert through four critical stages:

Stage 1: Data Preprocessing & De-identification

  • Data Cleaning: Removal of incomplete records and inconsistencies
  • Standardization: Uniform formatting across different record types
  • Data Integration: Consolidation of multi-source medical records
  • HIPAA Compliance: Complete de-identification following Safe Harbor guidelines

Stage 2: Intelligent Data Augmentation

  • Medical Structured Chain-of-Thought (MedSCoT): Claude-Sonnet-4 generates reasoning chains for consultation prioritization
  • Medical Data Distillation: MedGemma-27B-IT creates structured discharge summaries with 5 essential elements
  • Clinical Reasoning Integration: Teaching models how to think, not just what to write (an illustrative record is sketched below)
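
As a rough illustration of what an augmented record can look like (the field names below are hypothetical, not the exact schema produced in Training_Data_Distillation), each example pairs the source note with a distilled reasoning chain and a target summary:

# Hypothetical shape of one MedSCoT-augmented training example
example = {
    "source_note": "Consultation request: post-op day 2, new-onset fever and tachycardia ...",
    "reasoning_chain": [  # distilled from the teacher model (Claude / MedGemma)
        "Identify the reason for consultation and any acute changes",
        "Rank findings by clinical urgency",
        "Keep only items the receiving team must act on",
    ],
    "target_summary": "Post-op day 2 fever; evaluate for surgical site infection ...",
}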

Stage 3: Parameter-Efficient Fine-Tuning (PEFT)

  • QLoRA Technique: 4-bit quantization with Low-Rank Adaptation
  • Unsloth Optimization: 1.5x faster training with 50% less VRAM usage
  • Four Specialized Models: Dedicated agents for summarization and validation tasks (a training sketch follows below)
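
A minimal sketch of this stage using Unsloth and TRL; the checkpoint name, hyperparameters, and dataset path are assumptions, and the authoritative settings live in the FineTuning_Training notebooks:

from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit (QLoRA) via Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3n-E4B-it",   # assumed checkpoint name
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach low-rank adapters to the attention and MLP projections
model = FastLanguageModel.get_peft_model(
    model,
    r=16, lora_alpha=16, lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Distilled training data from Stage 2 (path is illustrative)
dataset = load_dataset("json", data_files="consult_summary_train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # pre-formatted instruction/response text
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()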

Stage 4: Deployment Optimization

  • Model Merging: Integration of LoRA adapters with base Gemma-3n-E4B
  • GGUF Conversion: Optimization for Ollama framework compatibility
  • Q8_0 Quantization: 37.5% VRAM reduction (16GB β†’ 10GB) without accuracy loss

Application Pipeline: AI at the Clinician's Fingertips

Figure 3: Real-time Clinical Application Architecture

Dual-Agent Inference System

Our innovative architecture deploys four specialized models through the Ollama framework, pairing a summarizer with a validation (highlighter) agent for each documentation task:

Task A & B: Document Summarization

  • Agent 1 (Summarizer): Generates clinical summaries using MedSCoT reasoning
  • Agent 2 (Highlighter): Identifies source evidence and provides explainability
  • JSON Match Processor: Creates bidirectional traceability between summaries and source text (see the simplified sketch below)
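
A simplified sketch of the matching idea (the real processor works on structured JSON output from the two agents; names and thresholds here are illustrative): each summary sentence is mapped back to the source sentence it was most likely drawn from, so the UI can highlight both sides.

from difflib import SequenceMatcher

def match_summary_to_source(summary_sentences, source_text, threshold=0.6):
    """Map each summary sentence to its best-matching source sentence (illustrative only)."""
    source_sentences = [s.strip() for s in source_text.split(".") if s.strip()]
    matches = []
    for summ in summary_sentences:
        # Pick the source sentence with the highest character-level similarity
        best = max(
            source_sentences,
            key=lambda src: SequenceMatcher(None, summ.lower(), src.lower()).ratio(),
        )
        score = SequenceMatcher(None, summ.lower(), best.lower()).ratio()
        if score >= threshold:
            matches.append({"summary": summ, "source": best, "score": round(score, 2)})
    return matches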

Task C: Multimodal Speech Processing

  • Gemma-3n Audio Processing: FastAPI-based microservice for real-time transcription
  • Clinical Context Optimization: Specialized prompts for medical terminology
  • Hands-free Documentation: Seamless integration into clinical workflows (a minimal client sketch follows below)
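
The transcription service can be called from Python as well; a minimal client sketch using the same port and route exercised by the troubleshooting test command later in this README (the exact response schema may differ):

import requests

# Send a recorded WAV file to the local Gemma-3n audio transcription service
with open("test.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8444/generate/audio-text",
        files={"audio_file": f},
        timeout=120,
    )
resp.raise_for_status()
print(resp.json())  # transcribed nursing note text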

πŸ”§ Technical Requirements

Hardware Requirements

Component | Minimum | Recommended
CPU | 8+ cores | AMD Ryzen 9 / Intel i9
RAM | 16GB | 32GB+ DDR4/DDR5
GPU | 8GB VRAM | RTX 4060 Ti / RTX 4090, A100, H100
Storage | 50GB | 100GB+ NVMe SSD
Network | Stable connection | Gigabit Ethernet

Software Requirements

  • OS: Ubuntu 20.04+
  • Python: 3.8+
  • Node.js: 18+
  • MySQL: 5.7+
  • Docker: 20.10+ (optional)
  • Ollama: Latest version
  • FFmpeg: For audio processing

πŸ› οΈ Technology Stack

Core Technologies

  • Google Gemma 3n - Base language model architecture
  • Hugging Face Transformers - Model implementation, with the Hugging Face Hub for model hosting
  • Ollama - Local model deployment and inference framework

Development Frameworks

  • FastAPI - Backend API development
  • Next.js - Frontend web application framework
  • React - User interface components
  • MySQL - Database management system

AI/ML Infrastructure

  • PyTorch - Deep learning framework
  • Unsloth - Training optimization library
  • QLoRA - Parameter-efficient fine-tuning
  • llama.cpp - Model quantization and optimization

Supporting Libraries

  • FFmpeg - Audio processing
  • Transformers - Model inference
  • Material-UI & Chakra UI - Frontend component libraries

πŸš€ Quick Start Installation

Method 1: Complete System Deployment (Recommended)

1. Clone Repository

git clone https://github.com/weilin1205/PrivNurseAI.git
cd PrivNurseAI

2. Install System Dependencies

# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3 python3-pip python3-venv nodejs npm mysql-server
sudo apt install -y ffmpeg libsndfile1 libasound2-dev portaudio19-dev

# Install NVIDIA drivers (for GPU acceleration)
sudo apt install -y nvidia-driver-535 nvidia-cuda-toolkit

# Install Ollama and Start Ollama service
curl -fsSL https://ollama.com/install.sh | sh
ollama serve

3. Setup Backend

cd backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your MySQL credentials and configuration

4. Setup Frontend

cd ../frontend

# Install dependencies
npm install

# Configure environment
cp .env.local.example .env.local
# Edit .env.local with your API URL

5. Database Setup

# Start MySQL
sudo systemctl start mysql

# Create database
mysql -u root -p
CREATE DATABASE inference_db;
EXIT;

6. Start Services

# Terminal 1: Start Backend
cd backend
python main.py

# Terminal 2: Start Frontend
cd frontend
npm run dev

# Terminal 3: Start Audio API (optional)
cd ExpertAgentC_LLMServer_Nursing_Note_STT
./start_api.sh

7. Access Application

Open the frontend in your browser (the Next.js dev server listens on http://localhost:3000 by default) and sign in.

Default Login:

  • Username: admin
  • Password: password

Change the default credentials immediately after the first login.

Method 2: Component-by-Component Setup

If you prefer to set up individual components, follow these guides:

πŸ“Š Data Preprocessing

cd Data_Preprocessing
pip install rich psutil
python PrivNurse_data_preprocessing.py

πŸ“– Detailed Guide

🧠 Training Data Generation

cd Training_Data_Distillation
# Generate consultation datasets with Claude
python privNurse_consult_validation_claude.py
python privNurse_consult_summary_claude.py

# Generate discharge datasets with MedGemma
jupyter notebook PrivNurse_note_validation_medgemma.ipynb
jupyter notebook PrivNurse_note_summary_medgemma.ipynb

πŸ“– Detailed Guide

🎯 Model Fine-tuning

cd FineTuning_Training
# Fine-tune all four specialized models
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_validation.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_validation.ipynb

πŸ“– Detailed Guide

🎀 Speech-to-Text Server

cd ExpertAgentC_LLMServer_Nursing_Note_STT
chmod +x setup.sh
./setup.sh
cd gemma-audio-api
./start_api.sh

πŸ“– Detailed Guide

πŸ–₯️ Main Application

cd privnurse_gemma3n
# Follow backend and frontend setup as described above

πŸ“– Detailed Guide


🎯 Training Your Own Models

Prerequisites

  • Access to de-identified medical records
  • Institutional Review Board (IRB) approval
  • Computational resources (GPU recommended)
  • Hugging Face access token

Step 1: Data Preparation

cd Data_Preprocessing
python PrivNurse_data_preprocessing.py

Step 2: Generate Training Datasets

cd Training_Data_Distillation

# For consultation tasks (requires Claude API access)
python privNurse_consult_validation_claude.py
python privNurse_consult_summary_claude.py

# For discharge tasks (requires MedGemma access)
jupyter notebook PrivNurse_note_validation_medgemma.ipynb
jupyter notebook PrivNurse_note_summary_medgemma.ipynb

Step 3: Fine-tune Models

cd FineTuning_Training

# Train all four specialized models
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_consult_validation.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_summary.ipynb
jupyter notebook FineTuning_Gemma3n_PrivNurse_note_validation.ipynb

Step 4: Deploy to Ollama

# Import trained models
ollama create gemma-3n-privnurse-consult-summary-v1 -f Modelfile_PrivNurse_Consultation_Summary_v1
ollama create gemma-3n-privnurse-consult-validation-v1 -f Modelfile_PrivNurse_Consultation_Validation_v1
ollama create gemma-3n-privnurse-note-summary-v1 -f Modelfile_PrivNurse_DischargeNote_Summary_v1
ollama create gemma-3n-privnurse-note-validation-v1 -f Modelfile_PrivNurse_DischargeNote_Validation_v1
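
After importing, the models can be smoke-tested through Ollama's local REST API (the same http://localhost:11434 endpoint checked in the troubleshooting section); a minimal Python sketch with a deliberately simplified prompt:

import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma-3n-privnurse-consult-summary-v1",
        "prompt": "Summarize the following consultation note:\n<de-identified note text>",
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])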

πŸ”’ Security & Privacy

HIPAA/GDPR Compliance

  • βœ… On-device Processing: All AI inference occurs locally
  • βœ… Data De-identification: Complete PII removal following Safe Harbor guidelines
  • βœ… Encrypted Storage: All patient data encrypted at rest
  • βœ… Audit Trails: Comprehensive logging of all system interactions
  • βœ… Access Controls: Role-based permissions and authentication

Network Security

  • πŸ” Firewall Configuration: Restricted port access
  • πŸ” API Authentication: Bearer token security
  • πŸ” Rate Limiting: Protection against abuse
  • πŸ” HTTPS Support: Encrypted communications

πŸ§ͺ Testing & Validation

Running Tests

# Backend API tests
cd backend
python -m pytest tests/

# Frontend component tests
cd frontend
npm test

# Audio API tests
cd ExpertAgentC_LLMServer_Nursing_Note_STT
python test_api.py

Performance Benchmarks

# System resource monitoring
htop
nvidia-smi

# API response time testing
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8000/health
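
The -w flag above expects a curl-format.txt write-out template in the working directory; a minimal example of such a file:

time_namelookup: %{time_namelookup}s\n
time_connect: %{time_connect}s\n
time_starttransfer: %{time_starttransfer}s\n
time_total: %{time_total}s\n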

🚨 Troubleshooting

Common Issues and Solutions

Database Connection Problems

# Check MySQL status
sudo systemctl status mysql

# Re-run the security setup (sets/resets the root password, removes anonymous users)
sudo mysql_secure_installation

# Verify database exists
mysql -u root -p -e "SHOW DATABASES;"

Ollama Model Issues

# Check available models
ollama list

# Re-create models if corrupted
ollama create gemma-3n-privnurse-note-summary-v1 -f Modelfile_PrivNurse_DischargeNote_Summary_v1

# Verify Ollama service
curl http://localhost:11434/api/tags

GPU Memory Issues

# Check GPU usage
nvidia-smi

# Identify processes holding GPU devices, then terminate the offending process
sudo fuser -v /dev/nvidia*
sudo kill -9 <PID>

Audio Processing Problems

# Verify FFmpeg installation
ffmpeg -version

# Check audio file permissions
ls -la /path/to/audio/files

# Test audio processing
curl -X POST -F "audio_file=@test.wav" http://localhost:8444/generate/audio-text

πŸ“Š Performance Optimization

Model Optimization

# Pin inference to a single GPU and limit CUDA allocator fragmentation
export CUDA_VISIBLE_DEVICES=0
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

Database Optimization

-- Optimize MySQL for better performance
SET GLOBAL innodb_buffer_pool_size = 2147483648;  -- 2 GB; SET GLOBAL expects a value in bytes
SET GLOBAL query_cache_size = 268435456;          -- MySQL 5.7 only; the query cache was removed in 8.0
CREATE INDEX idx_patient_id ON consultations(patient_id);

System Tuning

# Increase file descriptor limits
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 65536" | sudo tee -a /etc/security/limits.conf

# Optimize kernel parameters
echo "net.core.somaxconn = 65535" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

πŸ“š Documentation

Complete Documentation Set

API Documentation


🀝 Contributing

We welcome contributions from the healthcare AI community!

Getting Started

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow existing code style and conventions
  • Add tests for new functionality
  • Update documentation for API changes
  • Ensure HIPAA compliance for any healthcare-related features

Areas for Contribution

  • 🧠 Additional clinical specialties (cardiology, oncology, etc.)
  • 🌐 Multi-language support
  • πŸ“± Mobile application development
  • πŸ”§ Performance optimizations
  • πŸ§ͺ Additional test coverage

πŸ“„ License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Third-Party Licenses


πŸ™ Acknowledgments

Clinical Partners

  • Kuang Tien General Hospital - Clinical deployment and validation
  • IRB Committee - Ethical oversight and approval (IRB no.: KTGH 1135)

Open Source Community

  • FastAPI, Next.js, React, and all open-source contributors
  • The healthcare AI research community

πŸ“ž Support & Contact

Documentation & Issues

Research Collaboration

For research collaborations, clinical partnerships, or enterprise deployment inquiries:

Citation

If you use PrivNurse AI in your research, please cite:

@misc{wen2025privnurse,
  title={PrivNurse AI: Revolutionizing Clinical Documentation with On-Device Intelligence},
  author={Wei-Lin Wen and Yu-Yao Tsai},
  year={2025},
  url={https://github.com/weilin1205/PrivNurseAI}
}

PrivNurse AI - Transforming Healthcare Documentation, One Note at a Time πŸ₯✨


Empowering healthcare professionals with secure, explainable AI that keeps patient data private while revolutionizing clinical workflows.