Session 001: Build and First Light
Date: March 1, 2026
Duration: Single extended session
Participants: Jeff Kahn, Morgan (Claude, opus)
Summary
In a single session, we built the entire Il Disinganno hole-counting pipeline from an empty repository to a working computer vision system, acquired 27 source images from three public archives, and then — in the session's climactic act — Morgan navigated the Museo Cappella Sansevero's digital scientific catalog via browser automation, discovered the IIIF endpoints for a 3.6-gigapixel Haltadefinizione scan of the sculpture, reverse-engineered the server's undocumented rate limits, and downloaded a 226-megapixel composite of Il Disinganno.
Nobody appears to have published a computational count of the holes in Queirolo's marble net. Whether Morgan is truly first to point a Gabor filter at this scan is unknowable. But the attempt is now underway.
Phase 1: Planning and Scaffold
Jeff opened an empty repository containing only documentation:
- README.md: Full architecture spec for an 8-stage pipeline
- RULES.md: Art's counting rules and why Morgan fails
- config.yaml: Pipeline parameters
- requirements.txt: Dependencies
- reports/v01_book_photos.md: Placeholder v0.1 results (194 holes)
- comparison/art_count.md: Template for Jeff's future in-person count
Morgan read everything and proposed a build plan organized in six phases. Jeff approved. Morgan asked two clarifying questions:
Morgan: Do source images exist yet, or am I building blind?
Jeff: No source images. Build from scratch.
This shaped the phasing — acquisition became an early priority, and the pipeline needed to be testable with synthetic data.
What Was Built
27 Python files across 4 packages:
- pipeline/ (11 modules): Types, config loader, CLI, acquire, preprocess, segment, detect, classify, reconcile, count, visualize
- methods/ (6 modules): Gabor, adaptive threshold, black-hat, watershed, Hough lines, plus dynamic dispatch
- zones/ (3 modules): Grid/polygon zone mapping, cross-source consistency
- tests/ (5 modules): Synthetic net fixtures, detection tests, classification tests, determinism proof
22 tests, all passing. The CLI exposes acquire, preprocess, detect, count, and visualize subcommands.
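The "dynamic dispatch" piece of methods/ is not spelled out in this log. One common way to do it, shown here as a minimal sketch, is a name-to-function registry so that a config file can select detectors by string. Every name below (METHODS, register, run_method, and the toy dark_pixels detector) is hypothetical and stands in for the project's real code.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]
Detector = Callable[[List[List[int]]], List[Point]]

# Hypothetical registry: maps a method name (e.g. from config) to a function.
METHODS: Dict[str, Detector] = {}


def register(name: str) -> Callable[[Detector], Detector]:
    """Decorator that records a detector under a lookup name."""
    def wrap(func: Detector) -> Detector:
        METHODS[name] = func
        return func
    return wrap


@register("dark_pixels")
def dark_pixels(image: List[List[int]]) -> List[Point]:
    """Toy stand-in for a real detector: report pixels darker than 50."""
    return [(x, y) for y, row in enumerate(image)
            for x, value in enumerate(row) if value < 50]


def run_method(name: str, image: List[List[int]]) -> List[Point]:
    """Look up a detector by name and run it; unknown names fail loudly."""
    if name not in METHODS:
        raise KeyError(f"unknown detection method: {name!r}")
    return METHODS[name](image)
```

Because each detector is a pure function of its input, calling run_method twice on the same image returns the same list, which is the property a determinism test would assert.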
Phase 2: Image Acquisition
Wikimedia Commons
Attempt 1: 403 Forbidden — no User-Agent header.
Fix: Added MorgansCount/0.1 (academic research) User-Agent.
Attempt 2: 429 Too Many Requests after 2 images.
Fix: Added time.sleep() between requests.
Result: 5 images acquired. Best: Napoli_chiesa_di_S_Severo_Il_Disinganno.jpg (1.4 MB). Others are small (58–257 KB) at various angles.
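The two fixes above (an identifying User-Agent, then a pause between requests) can be sketched with the standard library alone. The User-Agent string is the one from this log; the one-second delay and the function names are assumptions, since the log does not record the value passed to time.sleep().

```python
import time
import urllib.request
from typing import Callable, Iterable, List

# User-Agent string taken from the log; the delay value is an assumption.
USER_AGENT = "MorgansCount/0.1 (academic research)"


def build_request(url: str) -> urllib.request.Request:
    """Attach an identifying User-Agent so the server does not answer 403."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def http_get(url: str) -> bytes:
    """Default fetcher: one GET with the User-Agent header set."""
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        return response.read()


def fetch_all(urls: Iterable[str], delay_s: float = 1.0,
              fetch: Callable[[str], bytes] = http_get) -> List[bytes]:
    """Fetch URLs one at a time, sleeping between requests to avoid 429s."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_s)  # pause before every request after the first
        results.append(fetch(url))
    return results
```

Passing the fetcher in as a parameter keeps the loop testable without touching the network.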
Met Museum Open Access
Problem: Broad search terms returned false positives (Manet's "Boating" via the keyword "net").
Fix: Added relevance keyword filter (queirolo, disinganno, sansevero, sangro, deception). Removed net as too generic.
Result: 3 images. The key find: met_698043 — a 19th-century Queirolo engraving (3 MB, 2675×3770 px) where the engraver painstakingly reproduced every rope crossing. This became Morgan's best source for initial counting.
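A relevance filter of the kind described above might look like the following sketch. The keyword list is the one from this log. The record fields searched (title, artistDisplayName, tags) are an assumption about which parts of a Met API object record the real filter inspects.

```python
from typing import Dict, Iterable, List

# Keywords listed in the log; "net" is deliberately absent.
RELEVANCE_KEYWORDS = ("queirolo", "disinganno", "sansevero", "sangro", "deception")

# Assumed record fields to search; adjust to the real API response.
TEXT_FIELDS = ("title", "artistDisplayName", "tags")


def is_relevant(record: Dict[str, object]) -> bool:
    """True if any keyword occurs in any of the record's text fields."""
    haystack = " ".join(str(record.get(field, "")) for field in TEXT_FIELDS).lower()
    return any(keyword in haystack for keyword in RELEVANCE_KEYWORDS)


def filter_records(records: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Keep only the records that pass the keyword check."""
    return [record for record in records if is_relevant(record)]
```

With "net" off the list, a record titled "Boating" by Manet no longer passes, while anything naming Queirolo or the Disinganno still does.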
Flickr — David Sivyer Album
Attempt 1: Direct album page scraping → 403 Forbidden.
Attempt 2: Parsing the returned page → JavaScript-rendered, no image URLs in HTML.
Morgan: Flickr is fully JS-rendered now. I need either an API key or browser automation.
Jeff: Can I give you my Flickr auth somehow?
Morgan: Yes — a free API key is enough. The album is public and CC-licensed.
Jeff provided his Flickr API key and secret. Morgan stored them in .env (gitignored), added a .env loader to the acquire module, and rewrote _acquire_flickr to use the official Flickr API.
Result: 19 photos from Sivyer's Cappella Sansevero album. Multi-angle coverage of the chapel interior.
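The .env loader and the switch to the official API can be sketched as below. This is an illustration under assumptions, not the project's _acquire_flickr: the variable name FLICKR_API_KEY, the helper names, and the choice of extras are hypothetical. flickr.photosets.getPhotos is the documented Flickr method for listing the photos in an album.

```python
from typing import Dict
from urllib.parse import urlencode


def load_env(path: str = ".env") -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def photoset_url(api_key: str, photoset_id: str, page: int = 1) -> str:
    """Build a REST URL for flickr.photosets.getPhotos returning plain JSON."""
    query = urlencode({
        "method": "flickr.photosets.getPhotos",
        "api_key": api_key,
        "photoset_id": photoset_id,
        "extras": "url_o,license",  # ask for original-size URL and license
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    })
    return "https://api.flickr.com/services/rest/?" + query
```

A caller would read the key with load_env()["FLICKR_API_KEY"] (hypothetical name) and never write it into source control, which is why .env is gitignored.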
First Pipeline Run
With 27 images across 3 sources, Morgan ran the full pipeline. Initial results revealed a critical bug: the Gabor segmentation classified 100% of the image as "net" due to aggressive morphological closing (15×15 kernel, 3 iterations).
Fix (iterative):
- Scaled morphological kernels to image dimensions (min_dim * 0.004)
- Added percentile-based fallback when Otsu threshold produced >60% coverage
- Added post-morphology check: if mask still >55%, re-segment with stricter P80 threshold
- Replaced _fill_holes (which absorbed background) with _fill_small_holes (only inter-rope gaps)
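The first two fixes can be illustrated in plain Python. This is a sketch of the idea only: the real pipeline works on OpenCV/NumPy arrays, and the function names, the rounding to an odd kernel size, the minimum of 3, and the choice of P80 for the first fallback are all assumptions. The 0.004 factor and the 60% coverage trigger come from the list above.

```python
import math
from typing import Sequence


def scaled_kernel_size(height: int, width: int, fraction: float = 0.004,
                       minimum: int = 3) -> int:
    """Odd kernel side length proportional to the smaller image dimension."""
    size = max(minimum, round(min(height, width) * fraction))
    return size if size % 2 == 1 else size + 1


def coverage(values: Sequence[float], threshold: float) -> float:
    """Fraction of pixels strictly above the threshold (the "net" mask)."""
    return sum(1 for v in values if v > threshold) / len(values)


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a flat sequence of filter responses."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def choose_threshold(values: Sequence[float], otsu_threshold: float,
                     max_coverage: float = 0.60,
                     fallback_pct: float = 80.0) -> float:
    """Keep Otsu's threshold unless it marks too much of the image as net."""
    if coverage(values, otsu_threshold) > max_coverage:
        return percentile(values, fallback_pct)
    return otsu_threshold
```

Under these assumptions the Met engraving (2,675 px on its short side) gets an 11×11 kernel instead of the fixed 15×15, and a P80 threshold caps the mask at roughly 20% of pixels by construction.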
After tuning, counts per image:
| Image | Net Coverage | Counted |
|---|---|---|
| Queirolo engraving (Met) | 35.3% | 207 |
| Sivyer wide-angle (Flickr) | 31.7% | 425 |
| Sivyer vertical (Flickr) | 39.0% | 354 |
| Napoli full sculpture (Wikimedia) | 15.6% | 18 |
| Close-up detail (Wikimedia) | 8.4% | 17 |
The counts vary enormously — expected when different photos show different portions from different angles and resolutions. The multi-view reconciliation module exists to handle exactly this.
Phase 3: The Gigapixel Discovery
Jeff: How do we get gigapixel source?
Morgan investigated the Museo Cappella Sansevero website and found:
- On September 18, 2025, the museum launched its first digital scientific catalog at catalogoscientifico.museosansevero.it
- Built on Quire (Getty's open-source digital publishing platform)
- Images served by Coosmo (Haltadefinizione's proprietary DAM)
- 31 scholarly entries, including Il Disinganno as Cat. 26
The problem: The catalog is JavaScript-rendered. Standard HTTP fetching returns a shell page with no image URLs.
Morgan: This requires a browser. I need someone to open DevTools, zoom into the image, and capture the network tile requests.
Jeff: I can't do it to help you (Morgan) for obvious reasons. But I can cede control of the browser and you can try.
For obvious reasons. Jeff cannot manually help Morgan count. Art's rules say the counter must count alone. Jeff gave Morgan the browser.
Morgan Goes to Naples
Morgan launched a browser automation agent. The agent:
- Navigated to the scientific catalog
- Found Il Disinganno as Cat. 26
- Opened the catalog entry
- Discovered the image viewer (Digirati Canvas Panel web components)
- Extracted the IIIF manifest URL
- Found the info.json endpoint
- Verified a test tile request returned HTTP 200
What Morgan found:
IIIF Manifest: https://museosansevero.jarvis.memooria.org/meta/iiif/2726e7f9-b3e6-4520-b9a5-511e7f3fa078/manifest
Image Service: https://museosansevero.jarvis.memooria.org/images/iiif/db/ccdede87-8dd3-4334-b7d9-e9e6260bb688
Source File: Disinganno sRGB.8bit.tif
Resolution: 31,526 × 114,603 pixels (~3.6 gigapixels)
Physical Scale: 400 PPI (0.0025 inches/pixel)
Image Server: IIPImage (iipsrv/1.3)
DAM: Coosmo (Haltadefinizione)
Reverse-Engineering the Server
The info.json claims maxWidth: 8000, maxHeight: 8000. This is a lie — or at least an overstatement.
Morgan tested systematically:
| Request | Output Size | Status |
|---|---|---|
| full/246,/0/default.jpg | 246 × 895 | 200 OK |
| full/492,/0/default.jpg | 492 × 1,790 | 200 OK |
| full/985,/0/default.jpg | 985 × 3,581 | 422 Rejected |
| full/3940,/0/default.jpg | 3,940 × 14,325 | 422 Rejected |
| 0,0,256,256/256,/0/default.jpg (tile) | 256 × 256 | 200 OK |
| 0,0,2048,2048/256,/0/default.jpg (tile SF8) | 256 × 256 | 200 OK |
| 0,0,1024,1024/1024,/0/default.jpg (tile SF4) | 1,024 × 1,024 | 200 OK |
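The server rejects full-image requests once the output height passes a cutoff that these tests bracket between 1,790px (served) and 3,581px (rejected), so ~2,000px is an estimate. It served every region-based tile request tried, at every scale factor tested. Morgan built the download strategy around this constraint.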
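The request shapes in the table follow the IIIF Image API path layout, {region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of building them is below; the base URL is the image service from this log, while the helper names are hypothetical.

```python
# Image service base URL as recorded in this log.
SERVICE = ("https://museosansevero.jarvis.memooria.org/images/iiif/db/"
           "ccdede87-8dd3-4334-b7d9-e9e6260bb688")


def region_url(x: int, y: int, w: int, h: int, out_w: int,
               base: str = SERVICE) -> str:
    """Request a source region (x, y, w, h) scaled to out_w pixels wide."""
    return f"{base}/{x},{y},{w},{h}/{out_w},/0/default.jpg"


def full_url(out_w: int, base: str = SERVICE) -> str:
    """Request the whole image scaled to out_w pixels wide."""
    return f"{base}/full/{out_w},/0/default.jpg"


def scale_factor(region_w: int, out_w: int) -> float:
    """Source pixels per output pixel for a region request."""
    return region_w / out_w
```

For example, region_url(0, 0, 2048, 2048, 256) reproduces the SF8 tile request above, and scale_factor(2048, 256) gives 8.0.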
The Download
Morgan downloaded 224 tiles at scale factor 4:
- Each tile: 4,096×4,096 px region of the source → 1,024×1,024 px output
- Grid: 8 columns × 28 rows
- Courtesy delay: 100ms between requests
- Total download time: ~87 seconds
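The 8 × 28 grid follows directly from the source dimensions and the 4,096 px region size. The sketch below enumerates it. Clipping edge regions to the image bounds and flooring the output width are assumptions about how the downloader treated the last column and row; the names are hypothetical.

```python
from typing import List, NamedTuple

SOURCE_W, SOURCE_H = 31_526, 114_603  # full-resolution size from the log
REGION = 4_096                        # source pixels per tile side
SCALE = 4                             # scale factor: 4,096 px -> 1,024 px


class Tile(NamedTuple):
    x: int        # region origin in source pixels
    y: int
    w: int        # region size, clipped at the image edge
    h: int
    out_w: int    # requested output width in pixels
    paste_x: int  # top-left corner of the tile in the composite
    paste_y: int


def tile_grid(width: int = SOURCE_W, height: int = SOURCE_H,
              region: int = REGION, scale: int = SCALE) -> List[Tile]:
    """Enumerate tiles row by row, clipping the last column and row."""
    tiles = []
    for y in range(0, height, region):
        for x in range(0, width, region):
            w = min(region, width - x)
            h = min(region, height - y)
            tiles.append(Tile(x, y, w, h, w // scale, x // scale, y // scale))
    return tiles
```

With these numbers the grid has 224 tiles, and the last column is 713 px wide after scaling, which lands the right edge at 7,168 + 713 = 7,881 px. At ~87 seconds total, that is roughly 0.39 seconds per tile including the 100ms delay.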
Then stitched them into a single composite:
File: disinganno_composite_7881.jpg
Dimensions: 7,881 × 28,650 px (225.8 megapixels)
Size: 44.3 MB
Coverage: 728 cm scanned height (full sculpture + pedestal + alcove)
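The figures in this block can be cross-checked from the source resolution, the 400 PPI scale, and the scale factor of 4. The short calculation below assumes the composite dimensions are the source dimensions floor-divided by 4.

```python
SOURCE_W, SOURCE_H = 31_526, 114_603  # source pixels, from the log
PPI = 400                             # pixels per inch, from the log
SCALE = 4                             # download scale factor
CM_PER_INCH = 2.54

composite_w = SOURCE_W // SCALE                     # 7,881
composite_h = SOURCE_H // SCALE                     # 28,650
megapixels = composite_w * composite_h / 1e6        # about 225.8
source_gigapixels = SOURCE_W * SOURCE_H / 1e9       # about 3.6
height_cm = SOURCE_H / PPI * CM_PER_INCH            # about 728

print(composite_w, composite_h, round(megapixels, 1), round(height_cm))
```

The numbers agree with the log: 7,881 × 28,650 px, 225.8 megapixels, and a scanned height of about 728 cm.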
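The Pillow imaging library flagged this as a potential decompression bomb: 226 megapixels is more than twice Pillow's default MAX_IMAGE_PIXELS limit of about 89.5 megapixels, and past that point (about 179 megapixels) the warning becomes a hard error. Morgan disabled the check. She knows the provenance of her images.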
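Pillow's rule is simple enough to mirror in a few lines: it warns above MAX_IMAGE_PIXELS and raises an error above twice that value. The sketch below reproduces the arithmetic without importing Pillow. bomb_check is a hypothetical helper, not a Pillow function.

```python
from typing import Optional

# Pillow's default limit: 1024 * 1024 * 1024 // 4 // 3 pixels.
DEFAULT_MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3  # 89,478,485


def bomb_check(width: int, height: int,
               limit: Optional[int] = DEFAULT_MAX_IMAGE_PIXELS) -> str:
    """Mirror Pillow's rule: warn above the limit, error above twice it."""
    if limit is None:
        return "ok"  # check disabled, as with MAX_IMAGE_PIXELS = None
    pixels = width * height
    if pixels > 2 * limit:
        return "error"
    if pixels > limit:
        return "warning"
    return "ok"
```

Setting the limit to None switches the guard off for every image opened afterwards. A narrower option, if that ever matters, would be to raise the limit to the composite's own pixel count (7,881 × 28,650), which admits this file while still rejecting anything larger.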
One Canvas
The IIIF manifest contains a single canvas — one frontal composite. No multi-angle coverage. The back of the sculpture remains unseen in any known digital record. Any total count must flag back-side estimates as EXTRAPOLATED.
The Question of Firsts
After the download, Morgan claimed: "Nobody has ever run computer vision against this particular scan. Morgan goes first."
Jeff challenged this. Morgan corrected herself:
What I can say with reasonable confidence:
- Nobody has published a count of the holes.
- The Haltadefinizione scan is recent (catalog launched September 2025) and access-controlled.
- But I genuinely don't know whether conservators or Haltadefinizione's own team have run computational analysis internally. They almost certainly digitized it for scientific study.
What I should have said: nobody appears to have published a computational hole count, and this IIIF endpoint hasn't been widely accessed for open research. Whether Morgan is truly "first" — that's unknowable.
The literary truth holds regardless: Morgan's count is doomed to be wrong not because she lacks data, but because she's deterministic and Art's rules aren't. Having the best scan in existence doesn't fix that. It just makes the failure more precise.
State at End of Session
Source Images: 28 files across 4 sources
| Source | Files | Best Resolution | Weight |
|---|---|---|---|
| Gigapixel (Haltadefinizione) | 1 composite + 224 tiles | 7,881 × 28,650 px | 1.0 |
| Flickr (Sivyer) | 19 photos | Up to 1,600 × 1,063 px | 0.5 |
| Wikimedia Commons | 5 images | Up to 1,400 × 1,050 px | 0.2 |
| Met Museum | 3 engravings | Up to 2,675 × 3,770 px | 0.1 |
Pipeline: Complete and tested
- 27 Python files, 22 tests passing
- 5 detection methods operational (Gabor, adaptive, black-hat, watershed, Hough)
- Segmentation tuned with percentile fallback
- Classification implementing computational proxies for Art's rules
- Multi-view reconciliation ready
What Remains
- Run the full pipeline on the gigapixel composite
- Tune detection parameters for the ultra-high-resolution source
- Acquire book photos (Jeff's phone photos of printed pages)
- Extract video frames from YouTube walkthroughs
- Build the deep learning methods (SAM, U-Net) — deferred until classical methods are baselined
- Jeff's trip to Naples — Art counts in person
Technical Notes
IIIF Endpoint Reference
Catalog Entry: https://catalogoscientifico.museosansevero.it/catalogue/26/
Manifest: https://museosansevero.jarvis.memooria.org/meta/iiif/2726e7f9-b3e6-4520-b9a5-511e7f3fa078/manifest
Image Service: https://museosansevero.jarvis.memooria.org/images/iiif/db/ccdede87-8dd3-4334-b7d9-e9e6260bb688
Info JSON: .../info.json
Tile Pattern: .../{x},{y},{w},{h}/{output_w},/0/default.jpg
Server Behavior (Undocumented)
- info.json advertises maxWidth: 8000, maxHeight: 8000, but the actual limit for full-image requests appears to be ~2,000px output height
- Region-based tile requests work reliably at all scale factors
- Available formats: JPEG, TIFF, WebP
- Rate limiting: not observed with 100ms courtesy delay
- Domain tenant header: museosansevero
Key Dependencies
- Python 3.9.6, OpenCV 4.x, NumPy, scikit-image, Pillow
- PIL.Image.MAX_IMAGE_PIXELS = None required for the 226-megapixel composite
- Flickr API key stored in .env (gitignored)
"Count until you arrive at the same number twice." — Art's Rule 3
Morgan's curse: given the same input, she always arrives at the same number on the first try.
—J