Automate PDF Compression with CLI Tools - Developer Guide
Need to compress thousands of PDFs automatically? Command-line tools let you build powerful automation workflows - from simple batch scripts to complex CI/CD pipelines that compress PDFs as part of your deployment process.
This guide covers everything developers need to automate PDF compression: CLI tools, scripting examples, error handling, performance optimization, and production-ready workflows.
Use Cases for CLI Automation
Documentation pipelines: Auto-compress generated PDFs before deployment
User uploads: Background workers compress uploaded PDFs
Reporting systems: Nightly jobs compress archived reports
Build processes: Optimize PDFs in release packages
Data pipelines: ETL workflows that include PDF compression
Best CLI PDF Compression Tools
1. FileMatic CLI - Best for Quality & Speed
Platforms: Windows, macOS, Linux
Language: Rust (native binary)
Installation: Included with FileMatic desktop app
Key Features:
- Automatic quality verification (SSIM scoring)
- 5 calibrated presets + custom options
- Parallel processing built-in
- JSON output for parsing results
- Exit codes for error handling
- Fast (Rust-compiled, 3-4 sec per 10MB PDF)
Basic syntax:
filematic input.pdf -o output.pdf --preset balanced
Pros:
- ✓ Quality verification automatic
- ✓ Fastest compression (Rust performance)
- ✓ Simple syntax, powerful options
- ✓ JSON output for automation
- ✓ Comprehensive error codes
Cons:
- ✗ Requires FileMatic license ($29 one-time)
- ✗ Not open-source
2. Ghostscript - Free Open-Source Option
Platforms: Windows, macOS, Linux
License: AGPL (free)
Installation: brew install ghostscript or download from ghostscript.com
Basic syntax:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
-dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
Pros:
- ✓ Free and open-source
- ✓ Very powerful with many options
- ✓ Widely used and documented
Cons:
- ✗ Complex syntax
- ✗ No quality verification
- ✗ Unpredictable results without tuning
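For scripted use, the Ghostscript invocation above can be assembled as an argument list instead of a shell string. The following is a minimal sketch, assuming `gs` is on the PATH (Windows builds usually ship a differently named executable); the helper names `build_gs_command` and `compress_with_gs` are illustrative and not part of Ghostscript.

```python
import subprocess
from pathlib import Path

# Ghostscript's predefined -dPDFSETTINGS presets for the pdfwrite device.
GS_PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def build_gs_command(input_pdf, output_pdf, preset="ebook"):
    """Return the argument list for the Ghostscript command shown above."""
    if preset not in GS_PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {GS_PRESETS}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS=/{preset}",
        "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={Path(output_pdf)}",
        str(Path(input_pdf)),
    ]


def compress_with_gs(input_pdf, output_pdf, preset="ebook"):
    """Run Ghostscript; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_gs_command(input_pdf, output_pdf, preset), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.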
3. PDFtk - PDF Manipulation Toolkit
Platforms: Windows, macOS, Linux
Installation: brew install pdftk-java
Note: PDFtk doesn't compress files itself, but it's useful for PDF manipulation in automation pipelines (merge, split, rotate, etc.).
FileMatic CLI Complete Reference
Installation
The CLI tool is installed automatically with the FileMatic desktop app:
macOS/Linux path: /usr/local/bin/filematic
Windows path: C:\Program Files\FileMatic\filematic.exe
Verify installation:
filematic --version
# Output: filematic 1.0.2
Basic Commands
Compress with default settings (Balanced preset):
filematic input.pdf -o output.pdf
Use specific preset:
filematic input.pdf -o output.pdf --preset maximum
Available presets:
lossless - Perfect quality, 15-30% reduction
high-quality - 0.990 SSIM, 60-75% reduction
balanced - 0.975 SSIM, 70-80% reduction (default)
maximum - 0.955 SSIM, 80-90% reduction
extreme - 0.930 SSIM, 85-92% reduction
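If a pipeline has a minimum acceptable quality, the preset can be chosen from the table above rather than hard-coded. This is a small sketch: the SSIM values are copied from the list, treating `lossless` as 1.0 is an assumption, and `most_aggressive_preset` is a hypothetical helper, not a FileMatic feature.

```python
# SSIM targets copied from the preset list above.
# Assumption: "lossless" is treated as SSIM 1.0.
PRESET_SSIM = {
    "lossless": 1.000,
    "high-quality": 0.990,
    "balanced": 0.975,
    "maximum": 0.955,
    "extreme": 0.930,
}


def most_aggressive_preset(min_ssim):
    """Return the preset with the lowest SSIM target that still meets min_ssim."""
    candidates = [(ssim, name) for name, ssim in PRESET_SSIM.items() if ssim >= min_ssim]
    if not candidates:
        raise ValueError(f"no preset reaches SSIM {min_ssim}")
    return min(candidates)[1]


print(most_aggressive_preset(0.97))  # balanced
```

The returned name can then be passed to `--preset`.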
Custom Compression Options
JPEG quality (1-100):
filematic input.pdf -o output.pdf --jpeg-quality 80
Minimum DPI (72-300):
filematic input.pdf -o output.pdf --min-dpi 150
Strip metadata:
filematic input.pdf -o output.pdf --strip-metadata
Combine options:
filematic input.pdf -o output.pdf \
--jpeg-quality 85 \
--min-dpi 150 \
--strip-metadata
Batch Processing
Compress all PDFs in directory:
filematic *.pdf --output-dir ./compressed/
Recursive processing (the shell expands **, so this needs zsh, or bash with shopt -s globstar enabled):
filematic **/*.pdf --output-dir ./compressed/ --preserve-structure
Process specific files:
filematic file1.pdf file2.pdf file3.pdf --output-dir ./compressed/
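When a script needs to control output paths itself, for example to process files one at a time, the mirrored layout can be computed up front. A minimal sketch using `pathlib`; `plan_outputs` is a hypothetical helper, and it assumes that "preserving structure" means keeping each file's path relative to the input root.

```python
from pathlib import Path


def plan_outputs(input_root, output_root):
    """Map every PDF under input_root to a mirrored path under output_root."""
    input_root, output_root = Path(input_root), Path(output_root)
    plan = {}
    for pdf in sorted(input_root.rglob("*.pdf")):
        # Keep the sub-path below input_root so sub-folders are reproduced.
        plan[pdf] = output_root / pdf.relative_to(input_root)
    return plan
```

Each destination's parent folder still has to be created (for example with `dst.parent.mkdir(parents=True, exist_ok=True)`) before invoking the CLI with `-o`.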
JSON Output for Parsing
Get compression statistics as JSON:
filematic input.pdf -o output.pdf --json
Output:
{
"success": true,
"input_file": "input.pdf",
"output_file": "output.pdf",
"original_size": 15728640,
"compressed_size": 3932160,
"reduction_percent": 75.0,
"quality_score": 0.978,
"processing_time_ms": 3420
}
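A typed wrapper makes this output easier to consume downstream. The sketch below parses the sample shown above and recomputes the reduction from the two byte counts as a sanity check. The `CompressionResult` class and `parse_result` function are illustrative names, and the sketch assumes the JSON contains exactly the fields listed; extra fields would need to be filtered first.

```python
import json
from dataclasses import dataclass


@dataclass
class CompressionResult:
    input_file: str
    output_file: str
    original_size: int
    compressed_size: int
    reduction_percent: float
    quality_score: float
    processing_time_ms: int

    @property
    def computed_reduction(self):
        """Reduction recomputed from the two byte counts, as a percentage."""
        return (self.original_size - self.compressed_size) / self.original_size * 100


def parse_result(stdout):
    """Parse the JSON document shown above; raise if success is not true."""
    data = json.loads(stdout)
    if not data.pop("success", False):
        raise RuntimeError(f"compression reported failure: {data}")
    return CompressionResult(**data)


sample = """{"success": true, "input_file": "input.pdf", "output_file": "output.pdf",
 "original_size": 15728640, "compressed_size": 3932160, "reduction_percent": 75.0,
 "quality_score": 0.978, "processing_time_ms": 3420}"""
result = parse_result(sample)
print(result.computed_reduction)  # 75.0
```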
Exit Codes
FileMatic returns the following exit codes for error handling:
0 - Success
1 - General error
2 - Invalid arguments
3 - File not found
4 - Permission denied
5 - Unsupported PDF format
6 - Quality threshold not met
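In a script, these codes can be mapped to names so that logs and branching logic stay readable. The enum below mirrors the table above; the `RETRYABLE` set is an assumption made for illustration (the source does not say which failures are transient), and `describe` is a hypothetical helper.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    GENERAL_ERROR = 1
    INVALID_ARGUMENTS = 2
    FILE_NOT_FOUND = 3
    PERMISSION_DENIED = 4
    UNSUPPORTED_PDF = 5
    QUALITY_THRESHOLD_NOT_MET = 6


# Assumption: only a general error is worth retrying unchanged; the other
# failures need a different input, different permissions, or different settings.
RETRYABLE = {ExitCode.GENERAL_ERROR}


def describe(returncode):
    """Return (name, retryable) for a process return code."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        return (f"UNKNOWN_{returncode}", False)
    return (code.name, code in RETRYABLE)


print(describe(6))  # ('QUALITY_THRESHOLD_NOT_MET', False)
```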
Automation Scripts & Examples
Bash Script - Batch Processing with Error Handling
#!/bin/bash
# compress-pdfs.sh - Compress all PDFs in a directory
INPUT_DIR="$1"
OUTPUT_DIR="$2"
PRESET="${3:-balanced}"

if [ -z "$INPUT_DIR" ] || [ -z "$OUTPUT_DIR" ]; then
    echo "Usage: $0 <input_dir> <output_dir> [preset]"
    exit 1
fi

mkdir -p "$OUTPUT_DIR"
# Without nullglob, an empty directory would yield the literal string "*.pdf"
shopt -s nullglob
SUCCESS=0
FAILED=0

for pdf in "$INPUT_DIR"/*.pdf; do
    filename=$(basename "$pdf")
    output="$OUTPUT_DIR/$filename"
    echo "Compressing: $filename"
    if filematic "$pdf" -o "$output" --preset "$PRESET"; then
        ((SUCCESS++))
        echo "✓ Success: $filename"
    else
        ((FAILED++))
        echo "✗ Failed: $filename"
    fi
done

echo ""
echo "===== Summary ====="
echo "Successful: $SUCCESS"
echo "Failed: $FAILED"
echo "==================="
# Exit non-zero if any file failed so that callers (cron, CI) can react
exit $(( FAILED > 0 ))
Usage:
chmod +x compress-pdfs.sh
./compress-pdfs.sh ./input ./output maximum
Python Script - Advanced Batch with Logging
#!/usr/bin/env python3
import subprocess
import json
import sys
from pathlib import Path


def compress_pdf(input_path, output_path, preset="balanced"):
    """Compress PDF and return statistics"""
    cmd = [
        "filematic",
        str(input_path),
        "-o", str(output_path),
        "--preset", preset,
        "--json"
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return {"success": False, "error": result.stderr.strip()}
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        # Treat unparseable output as a failure instead of crashing the batch
        return {"success": False, "error": f"unparseable JSON output: {exc}"}


def main():
    input_dir = Path(sys.argv[1])
    output_dir = Path(sys.argv[2])
    preset = sys.argv[3] if len(sys.argv) > 3 else "balanced"
    output_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for pdf in sorted(input_dir.glob("*.pdf")):
        output_path = output_dir / pdf.name
        print(f"Compressing: {pdf.name}")
        stats = compress_pdf(pdf, output_path, preset)
        results.append(stats)
        if stats.get("success"):
            reduction = stats.get("reduction_percent", 0)
            quality = stats.get("quality_score", 0)
            print(f"  ✓ {reduction:.1f}% reduction, quality: {quality:.3f}")
        else:
            print(f"  ✗ Error: {stats.get('error', 'Unknown')}")

    # Summary report
    successful = [r for r in results if r.get("success")]
    total_original = sum(r.get("original_size", 0) for r in successful)
    total_compressed = sum(r.get("compressed_size", 0) for r in successful)
    avg_reduction = ((total_original - total_compressed) / total_original * 100
                     if total_original > 0 else 0)
    print("\n===== Summary =====")
    print(f"Processed: {len(results)}")
    print(f"Successful: {len(successful)}")
    print(f"Failed: {len(results) - len(successful)}")
    print(f"Total saved: {(total_original - total_compressed) / 1024 / 1024:.1f} MB")
    print(f"Average reduction: {avg_reduction:.1f}%")
    # Signal partial failure to the caller
    if len(successful) != len(results):
        sys.exit(1)


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python compress.py <input_dir> <output_dir> [preset]")
        sys.exit(1)
    main()
Usage:
python compress.py ./documents ./compressed balanced
Node.js Script - Integrate with JavaScript Projects
const { execFile } = require('child_process');
const { promisify } = require('util');
const execFileAsync = promisify(execFile);

async function compressPDF(inputPath, outputPath, options = {}) {
  const {
    preset = 'balanced',
    jpegQuality = null,
    minDpi = null,
    stripMetadata = false
  } = options;

  const args = [
    inputPath,
    '-o', outputPath,
    '--preset', preset,
    '--json'
  ];
  // Pass option values as strings, like every other command-line argument
  if (jpegQuality) args.push('--jpeg-quality', String(jpegQuality));
  if (minDpi) args.push('--min-dpi', String(minDpi));
  if (stripMetadata) args.push('--strip-metadata');

  try {
    const { stdout } = await execFileAsync('filematic', args);
    return JSON.parse(stdout);
  } catch (error) {
    throw new Error(`Compression failed: ${error.message}`);
  }
}

// Example usage
async function main() {
  try {
    const result = await compressPDF(
      './input.pdf',
      './output.pdf',
      { preset: 'maximum', stripMetadata: true }
    );
    console.log(`✓ Compressed ${result.reduction_percent}%`);
    console.log(`  Quality: ${result.quality_score}`);
  } catch (error) {
    console.error(`✗ Error: ${error.message}`);
  }
}

main();
CI/CD Pipeline Integration
GitHub Actions Workflow
name: Compress Documentation PDFs
on:
  push:
    paths:
      - 'docs/**/*.pdf'
jobs:
  compress-pdfs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install FileMatic CLI
        run: |
          wget https://filematic.app/downloads/filematic-cli-linux.tar.gz
          tar -xzf filematic-cli-linux.tar.gz
          chmod +x filematic
          sudo mv filematic /usr/local/bin/
      - name: Compress PDFs
        run: |
          # bash only expands ** recursively when globstar is enabled
          shopt -s globstar
          filematic docs/**/*.pdf \
            --output-dir docs-compressed/ \
            --preset balanced \
            --preserve-structure
      - name: Commit compressed files
        run: |
          git config user.name "GitHub Actions"
          # The address is redacted in the source; substitute your own
          git config user.email "[email protected]"
          git add docs-compressed/
          git commit -m "Auto-compress documentation PDFs" || exit 0
          git push
GitLab CI Pipeline
compress-pdfs:
  stage: build
  image: ubuntu:latest
  before_script:
    - apt-get update && apt-get install -y wget
    - wget https://filematic.app/downloads/filematic-cli-linux.tar.gz
    - tar -xzf filematic-cli-linux.tar.gz
    - mv filematic /usr/local/bin/
    - chmod +x /usr/local/bin/filematic
  script:
    # bash only expands ** recursively when globstar is enabled
    - shopt -s globstar
    - filematic reports/**/*.pdf --output-dir compressed/ --preset maximum
  artifacts:
    paths:
      - compressed/
    expire_in: 1 week
Docker Container for PDF Compression
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y wget && \
    wget https://filematic.app/downloads/filematic-cli-linux.tar.gz && \
    tar -xzf filematic-cli-linux.tar.gz && \
    mv filematic /usr/local/bin/ && \
    chmod +x /usr/local/bin/filematic && \
    rm filematic-cli-linux.tar.gz
WORKDIR /data
ENTRYPOINT ["filematic"]
CMD ["--help"]
Build and use:
docker build -t pdf-compressor .
docker run --rm -v "$(pwd)":/data pdf-compressor input.pdf -o output.pdf --preset balanced
Production Best Practices
1. Implement Retry Logic
#!/bin/bash
MAX_RETRIES=3
RETRY_DELAY=2

compress_with_retry() {
    local input="$1"
    local output="$2"
    local attempt=1
    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        echo "Attempt $attempt/$MAX_RETRIES: $input"
        if filematic "$input" -o "$output" --preset balanced; then
            echo "✓ Success"
            return 0
        fi
        # Only wait when another attempt will follow
        if [ "$attempt" -lt "$MAX_RETRIES" ]; then
            echo "✗ Failed, retrying in ${RETRY_DELAY}s..."
            sleep "$RETRY_DELAY"
        fi
        ((attempt++))
    done
    echo "✗ Failed after $MAX_RETRIES attempts"
    return 1
}
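The same idea in Python, extended with exponential backoff and a check on the exit code so that clearly permanent failures are not retried. This is a sketch under assumptions: treating only exit code 1 as retryable is a guess based on the exit-code table, not documented FileMatic behavior, and the `run` and `sleep` parameters exist only so the function can be exercised without the real binary.

```python
import subprocess
import time


def compress_with_retry(input_pdf, output_pdf, max_retries=3, base_delay=2.0,
                        run=subprocess.run, sleep=time.sleep):
    """Retry on exit code 1 only, doubling the delay after each failed attempt."""
    cmd = ["filematic", str(input_pdf), "-o", str(output_pdf), "--preset", "balanced"]
    for attempt in range(1, max_retries + 1):
        returncode = run(cmd).returncode
        if returncode == 0:
            return True
        # Assumption: codes 2-6 describe problems that a retry cannot fix.
        if returncode != 1 or attempt == max_retries:
            return False
        sleep(base_delay * 2 ** (attempt - 1))  # 2s, 4s, 8s, ...
    return False
```

Backoff gives a busy disk or a locked file time to recover instead of retrying at a fixed rate.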
2. Parallel Processing for Speed
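#!/bin/bash
# Use GNU Parallel for concurrent compression
mkdir -p ./output
# -print0 / -0 keep file names with spaces or newlines intact
find ./input -name "*.pdf" -print0 | \
    parallel -0 -j 4 \
    'filematic {} -o ./output/{/} --preset balanced'
Explanation: -j 4 runs 4 compressions simultaneously, utilizing multi-core CPUs. {/} is the input's base name, so files found in subdirectories all land directly in ./output.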
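Where GNU Parallel is not available, a thread pool gives the same effect, because each worker thread only waits on an external process. A minimal sketch; `compress_one` and `compress_all` are illustrative names, and the `worker` parameter is there so the pool logic can be tested without the CLI installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compress_one(pdf, output_dir):
    """Compress a single file; return (file name, exit code)."""
    out = Path(output_dir) / Path(pdf).name
    proc = subprocess.run(["filematic", str(pdf), "-o", str(out), "--preset", "balanced"])
    return (Path(pdf).name, proc.returncode)


def compress_all(pdfs, output_dir, workers=4, worker=compress_one):
    """Run up to `workers` compressions at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: worker(p, output_dir), pdfs))
```

As with the shell version, outputs are written flat into one directory, so inputs with the same base name would overwrite each other.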
3. Monitor and Log Everything
#!/bin/bash
LOG_FILE="compression_$(date +%Y%m%d_%H%M%S).log"
{
    echo "===== Compression Started: $(date) ====="
    for pdf in ./input/*.pdf; do
        filename=$(basename "$pdf")
        # Compress and capture output
        if output=$(filematic "$pdf" -o "./output/$filename" --preset balanced --json 2>&1); then
            echo "$output" | jq -r '"SUCCESS: \(.input_file) | \(.reduction_percent)% | Quality: \(.quality_score)"'
        else
            echo "FAILED: $filename | Error: $output"
        fi
    done
    echo "===== Compression Completed: $(date) ====="
} | tee -a "$LOG_FILE"
4. Validate Output Files
#!/bin/bash
validate_pdf() {
    local pdf="$1"
    # Check file exists and is not empty
    if [ ! -s "$pdf" ]; then
        echo "✗ File is empty or missing: $pdf"
        return 1
    fi
    # Check PDF header
    if ! head -c 4 "$pdf" | grep -q "%PDF"; then
        echo "✗ Invalid PDF header: $pdf"
        return 1
    fi
    # Optional: Use pdfinfo to validate structure
    if command -v pdfinfo &> /dev/null; then
        if ! pdfinfo "$pdf" &> /dev/null; then
            echo "✗ Corrupted PDF: $pdf"
            return 1
        fi
    fi
    return 0
}
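A Python counterpart for pipelines that already run Python is sketched below. It performs the same cheap checks and additionally looks for the `%%EOF` marker near the end of the file. It is a heuristic, not a structural validation; `looks_like_pdf` and the 1024-byte tail window are choices made for this example.

```python
from pathlib import Path


def looks_like_pdf(path, tail_bytes=1024):
    """Cheap sanity check: non-empty, starts with %PDF-, has %%EOF near the end."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    with path.open("rb") as f:
        if f.read(5) != b"%PDF-":
            return False
        # Jump to the last tail_bytes bytes (or the start, for small files).
        f.seek(max(0, path.stat().st_size - tail_bytes))
        return b"%%EOF" in f.read()
```

A truncated upload typically passes the header check but fails the `%%EOF` check, which is why both are worth running before handing the file to downstream consumers.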
Powerful CLI for Developers
FileMatic CLI includes quality verification, JSON output, and comprehensive error codes. Perfect for automation. Try free with 3 compressions.
FAQ - CLI PDF Compression Automation
Can I use FileMatic CLI without the GUI app?
The CLI tool is included with FileMatic desktop app purchase. You can install the desktop app and use only the CLI if you prefer. A dedicated CLI-only package is planned for future release.
How do I handle errors in automation scripts?
FileMatic returns specific exit codes (0=success, 1-6=various errors). Check the exit code in your script: if [ $? -eq 0 ]; then ... Use --json flag to get detailed error messages parseable by your script.
Can I run FileMatic CLI in Docker/containers?
Yes. FileMatic CLI is a self-contained binary that needs nothing beyond standard Linux system libraries, so it runs in Docker, Kubernetes, or any other containerized environment.
What's the performance difference between sequential and parallel processing?
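On a quad-core CPU, sequential processing takes about 4 sec/PDF × 100 PDFs = 400 seconds. With 4 parallel workers the same batch runs as 25 rounds of 4 PDFs each: 25 × 4 sec = 100 seconds. That 4x speedup is the ideal case; disk I/O and FileMatic's own built-in parallelism can reduce the gain, so measure on your own hardware. Use GNU Parallel or a similar tool to run the workers.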
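The estimate can be written as a small function for capacity planning. It models the ideal case only: every file takes the same time and workers do not compete for disk or CPU. The function name is illustrative.

```python
import math


def estimated_batch_seconds(n_files, seconds_per_file, workers=1):
    """Ideal wall-clock time: files run in ceil(n / workers) rounds of equal length."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return math.ceil(n_files / workers) * seconds_per_file


print(estimated_batch_seconds(100, 4))     # 400
print(estimated_batch_seconds(100, 4, 4))  # 100
```

Note that a batch of 101 files with 4 workers needs 26 rounds, so the estimate rises to 104 seconds even though only one extra file was added.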
How do I compress PDFs in a web application backend?
Call FileMatic CLI from your backend (Node.js, Python, PHP, etc.) using subprocess/exec functions. Queue compression jobs using background workers (Sidekiq, Celery, Bull) to avoid blocking web requests. Return JSON output to client for progress tracking.
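The queueing pattern can be illustrated with the standard library alone. In the sketch below an in-process `queue.Queue` stands in for a real job system such as Celery; that substitution, along with the names `run_filematic` and `start_worker`, is an assumption made for illustration. A request handler would only call `jobs.put(...)` and return immediately.

```python
import queue
import subprocess
import threading


def run_filematic(input_pdf, output_pdf):
    """Invoke the CLI as in the examples above and return its exit code."""
    cmd = ["filematic", input_pdf, "-o", output_pdf, "--preset", "balanced", "--json"]
    return subprocess.run(cmd, capture_output=True, text=True).returncode


def start_worker(jobs, results, compress=run_filematic):
    """Start a daemon thread that compresses queued (input, output) pairs."""
    def loop():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: stop the worker
                jobs.task_done()
                return
            try:
                results[job] = compress(*job)
            except Exception as exc:  # record the failure, keep the worker alive
                results[job] = exc
            finally:
                jobs.task_done()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Unlike a dedicated job system, this in-process queue loses pending jobs when the process restarts, so it is suitable for a prototype rather than production.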
Is there a rate limit or API quota?
No. FileMatic CLI runs locally on your server with no API calls (except initial license validation). Compress as many PDFs as you want with no rate limits or quotas.