I love this! I tried to apply the same idea to scan the tallest tree in New England with a drone. It didn't come out great, but I might just try again now.
Here is how it came out: https://www.daviddegner.com/wp-content/uploads/2023/09/Tree-...
It was part of this story: https://www.daviddegner.com/photography/discovering-old-grow...
I have been creating animations using a similar process but with a regular camera and manually splicing the frames together. [1,2,3] The effect is quite interesting in how it forces focus on the subject reducing the background into an abstract pattern. Each 'line' is around 15px wide.
[1] https://youtube.com/shorts/VQuI1wW8hAw [2] https://youtube.com/shorts/vE6kLolf57w [3] https://youtube.com/shorts/QxvFyasQYAY
I also shot a timelapse of the Tokyo skyline at sunset and applied a similar process [4], then motion tracked it so that time is traveling across the frame from left to right[5]. Each line here is 4 pixels wide and the original animation is in 8k.
[4] https://youtu.be/wTma28gwSk0 [5] https://youtu.be/v5HLX5wFEGk
Reminds me of the early experiments with using a flat-bed scanner as a digital back. Here is one: https://www.sentex.net/~mwandel/tech/scanner.html
Does anyone know what it looks like when you use a line scan camera to take a picture of the landscape from a moving car or train? I suspect the parallax produces some interesting distortions.
I've taken a couple of pics from a moving train...
Nankai 6000 series, Osaka:
https://i.dllu.net/nankai_19b8df3e827215a2.jpg
Scenery in France:
https://i.dllu.net/preview_l_b01915cc69f35644.png
Marseille, France:
https://i.dllu.net/preview_raw_7292be4e58de5cd0.png
California:
https://i.dllu.net/preview_raw_d5ec50534991d1a4.png
https://i.dllu.net/preview_raw_e06b551444359536.png
Sorry for the purple trees. The camera is sensitive to near infrared, in which trees are highly reflective, and I haven't taken any trains since buying an IR cut filter. Some of these also have dropped frames and other artifacts.
Exactly what I wanted to know. Is it technically feasible to 'scan' a whole landscape of, let's say, an hour-long train ride?
It’s just a blur. Like the background of the photos in this article.
You can get some cool distortions at very slow speeds, but at car or train speeds you won’t see anything
Wow, great article. I love the cable car photo https://upload.wikimedia.org/wikipedia/commons/e/e0/Strip_ph...
Must be somewhat interesting deciding on the background content, too.
Iirc, at the last Olympics, Omega paired a high-frequency linear display with their finish-line strip cameras. Regular cameras saw a flashing line, but the backdrop to photo-finishes was an Omega logo. Very subtle, but impressive to pull off.
More line scan trains: https://news.ycombinator.com/item?id=35738987
IMO the denoising looks rather unnatural and emphasizes the remaining artifacts, especially color fringe around details. Personally I'd leave that turned off. Also, with respect to the demosaic step, I wonder if it's possible to implement a version of RCD [1] for improved resolution without the artifacts that seem to result from the current process.
[1] https://github.com/LuisSR/RCD-Demosaicing
Yeah, I actually have it disabled by default since it makes the horizontal stripes more obvious, and it's also extremely slow. Also, I found that my vertical stripe correction doesn't work in all cases and sometimes introduces more stripes. Lots more work to do.
As for RCD demosaicing, that's my next step. The color fringing is due to the naive linear interpolation for the red and blue channels. But with the RCD strategy, since the green channel has full coverage of the image, we can use it as a guide to improve the interpolation.
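A rough sketch of the guided idea (not the full RCD algorithm, and not the pipeline's actual code): interpolate the R - G and B - G color differences instead of R and B directly, so the fully populated green plane steers the chroma. The RGGB layout and all names here are illustrative assumptions.

    import numpy as np
    from scipy.interpolate import griddata

    def interp_channel_guided(mosaic, mask, G):
        """Interpolate a sparse color channel using the full green plane as a guide.

        mosaic: raw Bayer data, valid for this channel where mask is True
        G:      full-resolution green from an earlier demosaic pass
        Interpolating the color difference (R - G) rather than R itself keeps
        chroma smooth across edges that green already resolves, which is the
        intuition behind guided schemes like RCD.
        """
        ys, xs = np.nonzero(mask)
        diff = mosaic[mask] - G[mask]          # color difference at the sample sites
        gy, gx = np.mgrid[0:G.shape[0], 0:G.shape[1]]
        diff_full = griddata((ys, xs), diff, (gy, gx),
                             method="linear", fill_value=0.0)
        return G + diff_full                   # add the guide back in

    # Usage, assuming an RGGB mosaic with R at even rows / even columns:
    # mask_r = np.zeros(mosaic.shape, dtype=bool); mask_r[0::2, 0::2] = True
    # R_full = interp_channel_guided(mosaic, mask_r, G_full)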
When you do the demosaicing, and perhaps other steps, did you ever consider declaring the x-positions, spline parameters, ... as latent variables to estimate?
Consider a color histogram: the logo (showing color oscillations) would have a wider-spread, lower-peaked histogram, versus a correctly mapped image (just the few colors plus or minus some noise), which would show a very thin but strong peak in colorspace. A high-variance color occupation has higher entropy than a low-variance, strongly centered peak (or multi-peak) distribution.
So it seems colorspace entropy could be a strong term in a loss function for optimization (using RMAD).
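A minimal sketch of that entropy term, assuming an RGB image with values in [0, 1]; a real loss for gradient-based optimization (the RMAD mentioned above) would need a soft, differentiable histogram rather than the hard binning used here.

    import numpy as np

    def colorspace_entropy(img, bins=32):
        """Shannon entropy of an image's joint RGB histogram.

        A mis-mapped image smears color oscillations across many thinly
        filled histogram cells (high entropy); a correctly mapped one
        concentrates into a few strong peaks (low entropy), so minimizing
        this value is the proposed loss term.
        """
        h, _ = np.histogramdd(img.reshape(-1, 3), bins=bins, range=[(0, 1)] * 3)
        p = h.ravel() / h.sum()
        p = p[p > 0]                       # 0 * log(0) is taken as 0
        return float(-(p * np.log2(p)).sum())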
Do you share some of the original raw recordings somewhere?
Yeah, I don't think the denoised result looks that good either.
I like this video about the photo-finish line camera at a horse track: https://www.youtube.com/watch?v=Ut0nKdLCAEo Maybe someone else will enjoy it too.
The way the train sits crisply and motionlessly locked between these perfect stripes of colour gives it an incredible sense of speed.
It's neat that it captured the shadow of the subway train, too, which arrived just ahead of the train itself. This virtual shadow is thrown against a sort of extruded tube with the profile of the slice of track and wall that the slit was pointed at.
Anyone know of a steam train captured in the same way? I'm interested in the effect of the parts with vertical motion such as the pistons and steam clouds, combined with the largely static body.
I looked into line cameras for a project. I think their main application is in quality control of food on conveyor belts; there are plenty of automated sorting systems that can become a bottleneck. One of the units I specced out could record an 8k-pixel line at up to 40 kfps.
https://youtu.be/E_I9kxHEYYM
They are used in OCT (optical coherence tomography) as well.
OCT is a technique which gets "through" tissue using a beam in the near infrared (roughly 950 nm, with a spread of roughly 100 nm). The return is passed through an interferometer and what amounts to a diffraction grating to produce the "spread" that the line camera sees. After some signal processing (an FFT is a big part of it), you can get the intensity at depth. If you sweep in X, Y somehow, usually by deflecting the beam with a mirror, you can obtain a volumetric image like an MRI or sonogram. Very useful for imaging the eye, particularly the back of the retina where the blood vessels are.
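Roughly, reconstructing one depth profile (an A-scan) from the spectral line the camera captures looks like this sketch; resampling from wavelength to evenly spaced wavenumber and dispersion correction are glossed over, and all names are illustrative.

    import numpy as np

    def a_scan(spectrum, background):
        """Recover a depth profile (A-scan) from one spectrometer line.

        spectrum:   raw line-camera readout, interference fringes vs. wavelength
        background: reference-arm-only readout, used to remove the DC envelope
        A real system first resamples from wavelength to evenly spaced
        wavenumber (k); that step is omitted here.
        """
        fringes = np.asarray(spectrum, dtype=float) - background
        fringes *= np.hanning(fringes.size)      # window to suppress FFT sidelobes
        depth = np.abs(np.fft.rfft(fringes))     # |FFT| ~ reflectivity vs. depth
        return depth

    # Sweeping the beam in X while stacking A-scans gives a 2-D slice (B-scan);
    # adding a Y sweep gives the volumetric image mentioned above.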
Yah, lots of neat line scan camera applications in spectroscopy; basically any grating application. 950 nm would be on the edge of where you'd implement a Si CCD for OCT, since sensitivity drops as the Si stops absorbing. InGaAs detectors are used further into the NIR.
Satellites are also a big use case.
A number of the sats I worked with are single-point cameras: the satellite spins about a major axis oriented in the direction of travel, the point camera rotates with the satellite, and a series of points of data are written to a line of storage as the camera points at the earth and pans across while the sat also moves forward.
Data stops being written as the sat rotates the camera away from the planet, and resumes once it has rolled over enough to point at the earth again.
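A toy model of that acquisition loop (all names and the Earth-view window are illustrative assumptions, not any real sat's parameters):

    import numpy as np

    def spin_scan(sample_point, samples_per_rev, n_revs, earth_window=(-0.3, 0.3)):
        """Toy model of the spin-scan acquisition described above.

        The spacecraft spins about its velocity axis, so the single-point
        camera sweeps across the Earth once per revolution.  Samples are
        written only while the look angle (radians from nadir) is inside
        earth_window; each revolution yields one stored line, and forward
        motion provides the next line.
        """
        lines = []
        for rev in range(n_revs):                  # one image line per spin
            line = []
            for i in range(samples_per_rev):
                angle = 2 * np.pi * i / samples_per_rev - np.pi
                if earth_window[0] <= angle <= earth_window[1]:
                    line.append(sample_point(rev, angle))  # camera sees Earth
                # otherwise the camera points at space and nothing is written
            lines.append(line)
        return np.asarray(lines)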
It may seem like a pedantic difference: a "line scan camera" is stationary while mirrors inside it spin or some other mechanism causes it to "scan" a complete vertical line (perhaps all at once, perhaps as the focal point moves), versus a camera in a satellite that has no moving parts and just records a single point directly in front of the instrument, while the entire satellite spins and moves forward.
If you like this sort of thing, check out https://www.magyaradam.com/wp/ too. A lot of his work uses a line scan camera.
The video [https://www.magyaradam.com/wp/?page_id=806] blew my mind. I can only imagine he reconstructed the video by first reconstructing one frame's worth of slits, then shifting them over by one column and adding the next slit's data.
None of the shots in that video use the slit-scan technique. It's using a technique called Mean Stack Mode to get the average pixel value across multiple frames, over a rolling selection of an input video.
Super cool. I wonder if you could reuse a regular 2-D CMOS digital camera sensor to the same effect. But now I realize your sensor is basically 1-D and has a 95 kHz sampling rate. At the same rate with a 4k sensor you'd have way too much data to store and would need to throw most of it away.
Pretty sure you could do it, but it would be very expensive, because you'd need a lot more very fast ADCs.
Like if the camera is $5k, in order to get that exposure time full-field you would need to duplicate the hardware 800 times, or however many columns of horizontal resolution you wanted. That's a lot of zeros for a single camera.
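Back-of-the-envelope numbers, assuming 2 bytes per pixel, the 95 kHz rate mentioned above, and illustrative sensor sizes:

    # Rough data-rate comparison; sensor sizes and bytes/pixel are assumptions.
    line_px  = 4096                      # 4k line-scan sensor
    frame_px = 4096 * 2160               # hypothetical 4k 2-D sensor
    rate_hz  = 95_000
    bytes_px = 2

    line_rate  = line_px  * rate_hz * bytes_px   # ~0.78 GB/s, already a firehose
    frame_rate = frame_px * rate_hz * bytes_px   # ~1.7 TB/s, hopeless to store

    print(f"line scan: {line_rate  / 1e9:7.2f} GB/s")
    print(f"full 2-D:  {frame_rate / 1e12:7.2f} TB/s")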
Fun read! I used to work in sensor calibration, and most people take for granted how much engineering goes into making phones take good photos. There's a nontrivial amount of math and computational photography in the modern phone camera.
They have an amazing painterly quality. I'm not a huge train fan but I'd put some of these on my wall.
> Hmm, I think my speed estimation still isn’t perfect. It could be off by about 10%.
It would probably be worth asking a train driver about this, e.g. "where is a stretch with smooth track and constant speed?"
Maybe an optical flow sensor to estimate speed in real time?
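For example, with a regular camera watching the train, a minimal sketch of estimating the per-frame pixel shift via OpenCV's phase correlation (the frame handling here is an assumption, not anything from the article):

    import cv2
    import numpy as np

    def pixel_shift(prev_gray, next_gray):
        """Estimate the translation between two consecutive grayscale frames.

        cv2.phaseCorrelate returns the (dx, dy) shift that best aligns the
        two images; dx in pixels/frame, times the frame rate and the ground
        distance per pixel, gives a live speed estimate to feed the
        line-scan timing.
        """
        (dx, dy), _response = cv2.phaseCorrelate(np.float32(prev_gray),
                                                 np.float32(next_gray))
        return dx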
That's a lot more than I thought I'd want to know about this, but I was totally nerd sniped. Great writeup.
What a beautiful example of image processing. Great post
These are amazing images. I don't understand what's going on here, but I do like the images.
Imagine a camera that only takes pictures one pixel wide. Now make it take a picture, for example, 60 times a second and append every pixel-wide image together in order. This is what's happening here, it's a bunch of one pixel wide images ordered by time. The background stays still as it's always the same area captured by that one pixel, resulting in the lines, but moving objects end up looking correct as they're spread out over time.
At first, I thought this explanation would make sense, but then I read back what I just wrote and I'm not sure it really does. Sorry about that.
No, thank you. This was perfect. It completely explains where the train comes from and where the lines come from.
Lightbulb on.
Aha achieved. (Don’t you love Aha? I love Aha.)
Yeah, like walking past a door that's cracked open just a bit, so you can see into an office through only a slit. Now reconstruct the whole office from that traveling slit that you saw.
Very cool.
It made sense to me!
Okay I was stumped about how this works because it's not explained, as far as I can tell. But I guess the sensor array has its long axis perpendicular to the direction the train is traveling.
The analogue equivalent (a slit scan camera) is easier to understand, I think https://www.lomography.com/magazine/283280-making-a-slit-sca... https://petapixel.com/2017/10/18/role-slit-scan-image-scienc...
You can also get close in software. Record some video while walking past a row of shops. Use ffmpeg to explode the video into individual frames. Extract column 0 from every frame, and combine them into a single image, appending each extracted column to the right-hand-side of your output image. You'll end up with something far less accurate than the images in this post, but still fun. Also interesting to try scenes from movies. This technique maps time onto space in interesting ways.
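A minimal sketch of that recipe using numpy and imageio after ffmpeg has exploded the video (all filenames are placeholders):

    # First: ffmpeg -i walk.mp4 frames/%05d.png
    import glob
    import imageio.v3 as iio
    import numpy as np

    columns = []
    for path in sorted(glob.glob("frames/*.png")):
        frame = iio.imread(path)
        columns.append(frame[:, 0])     # column 0 of every frame

    # Stack the columns left to right: x now indexes time, not space.
    out = np.stack(columns, axis=1)
    iio.imwrite("slitscan.png", out)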
https://www.youtube.com/watch?v=Ut0nKdLCAEo
Thanks, I added a section called "Principle of operation" to explain how it works.
Absolutely fascinating stuff! Thank you so much for adding detailed explanations of the math involved and your process. Always wondered how it worked but never bothered to look it up until today. Reading your page pushed it beyond idle curiosity for me. Thanks for that. And thanks also to HN for always surfacing truly interesting reading material on a daily basis!
What's your FPS/LPS in this setup? I've experimented with similar imaging with an ordinary camera, but LPS was limiting, and I know line-scan machine vision cameras can output some amazing numbers, like 50k+ LPS.
You use a single vertical line of sensors and resample "continuously". When doing this with film, the aperture is a vertical slit and you continuously advance the film during the exposure.
For "finish line" cameras, the slit is located at the finish line and you start pulling film when the horses approach. Since the exposure is continuous, you never miss the exact moment of the finish.
Line scan sensors are basically just scanners; heck, people make 'em out of scanners.
Usually the issue is that they need rather still subjects, but in this case, rather than the sensor doing a scanning sweep, it's just capturing the subject as it moves by, keeping the background pixels static.
It only works for trains because the image of the train at t+1 is basically the image of the train at time t shifted over by a few pixels, right? It doesn't seem like this would work for a picture of a human, since humans don't just rigidly translate in space as they move.
Depends what you're going for.
https://en.wikipedia.org/wiki/Slit-scan_photography#/media/F...
If the human is running and doesn't frantically shake, it works decently. There are samples of horse-race finish-line pics in the article, and they look pretty good IMHO.
It falls apart when the subject is either static or moves its limbs faster than the speed at which the whole subject moves (e.g. fist bumping while slowly walking past the camera would screw it up).
reading this is how I imagine it feels to be chatgpt