Wednesday, July 28, 2010

Plot of groove center and width and derivatives

Each plot below contains two sub-plots: one showing the entire data set and one showing a portion of it.

Silence+Sine signal

  1. raw CenterRadius: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/c.png
  2. raw width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/w.png
  3. CenterRadius - width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/c-w.png
  4. CenterRadius + width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/c+w.png
  5. derivative of (CenterRadius - width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/d(c-w).png
  6. derivative of (CenterRadius + width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/d(c+w).png
  7. derivative of CenterRadius: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/dc.png
  8. derivative of width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/dw.png
  9. deriv(CenterRadius) - deriv(width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/dc-dw.png
  10. deriv(CenterRadius) + deriv(width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/sine/dc+dw.png

Musical signal

  1. raw CenterRadius: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/c.png
  2. raw width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/w.png
  3. CenterRadius - width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/c-w.png
  4. CenterRadius + width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/c+w.png
  5. derivative of (CenterRadius - width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/d(c-w).png
  6. derivative of (CenterRadius + width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/d(c+w).png
  7. derivative of CenterRadius: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/dc.png
  8. derivative of width: http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/dw.png
  9. deriv(CenterRadius) - deriv(width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/dc-dw.png
  10. deriv(CenterRadius) + deriv(width): http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/music/dc+dw.png

 

 


Tuesday, July 27, 2010

True stereo samples

Summary

Double checked the stereo signal rendering:

Left channel = CenterRadius - Width

Right channel = CenterRadius + Width
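A minimal sketch of this channel rendering, assuming the center radius and width are available as equal-length sequences of per-sample values. The function names and the first-difference derivative are illustrative assumptions, not the actual processing chain.

```python
def stereo_channels(center_radius, width):
    """Left = CenterRadius - Width, Right = CenterRadius + Width, sample by sample."""
    left = [c - w for c, w in zip(center_radius, width)]
    right = [c + w for c, w in zip(center_radius, width)]
    return left, right

def first_difference(signal):
    """Simple stand-in for differentiation: difference of neighbouring samples."""
    return [b - a for a, b in zip(signal, signal[1:])]

# Toy data (hypothetical values, arbitrary units).
center = [100.0, 101.0, 103.0]
width = [10.0, 10.5, 10.0]
left, right = stereo_channels(center, width)
d_left = first_difference(left)
# Differencing is linear, so d(c - w) equals dc - dw.
dc_minus_dw = [a - b for a, b in zip(first_difference(center), first_difference(width))]
```

Because differencing is linear, the derivative of (CenterRadius - Width) and deriv(CenterRadius) - deriv(Width) should agree up to numerical error for this kind of differentiator.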

 

Silence + sine

 

Music

  • http://kakyoism.webhop.net/~kakyo/0.report/49.stereo/stereo_music_riaa.wav

BTW:

"kakyo" is written 華僑 in Japanese, meaning "overseas Chinese".

 


Stereo vs. Groove shape

The music and sine wave extracted last time are stereo signals, but only the lateral information was used; the depths were discarded. Incorporating depth info, however, depends on the groove-bottom data, whose reliability is in question.

Remember that we previously discussed the fact that grooves in practice are "U" shaped instead of "V" shaped, so there is no immediate 1-unit-wide bottom from which to derive groove depths. In other words, to be able to measure groove depth, we need to thin the bottom so that it is 1 pixel wide. This problem has not been reported before. There are a few options:

  • Use pure image processing thinning (our current solution)
  • At a groove cross-section, use the bottom point closest to the center of the cross-section.
  • At a groove cross-section, use the bottom point that is deepest.

These options will inevitably introduce noise because they ignore the fact that the cutting stylus creates some correlation between the lateral and vertical groove shape parameters, such as width and depth, the ratio between the bottom-to-inner-edge width and the bottom-to-outer-edge width, etc. If an inappropriate bottom thinning strategy is used, the overall vertical groove info may not correlate well with the lateral info and won't yield good stereo audio.

We now call top edges "ridges".

To make sure that the depth information is good enough to use together with the lateral info, the requirements include:

  • Along the entire groove, the ratio between the bottom-to-inner-ridge width and the bottom-to-outer-ridge width should be consistent, e.g., at any part of the groove, the ratio remains close to 0.98. The bottom should not be closer to the inner ridge at some places and closer to the outer ridge at others.
  • Along the entire groove, the ratios between the groove widths and depths (diff-z between ridges and bottom) should be consistent.
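One hedged way to quantify "consistent" is the coefficient of variation of a ratio along the groove. The metric and the sample numbers below are assumptions for illustration; the same function applies to the width-to-depth ratio.

```python
import statistics

def ratio_consistency(numerators, denominators):
    """Return (mean ratio, coefficient of variation) along the groove."""
    ratios = [n / d for n, d in zip(numerators, denominators)]
    mean_ratio = statistics.mean(ratios)
    spread = statistics.pstdev(ratios)
    return mean_ratio, spread / mean_ratio

# Hypothetical bottom-to-inner-ridge and bottom-to-outer-ridge widths (pixels).
stable_mean, stable_cv = ratio_consistency([49.0, 49.5, 49.0], [50.0, 50.5, 50.0])
# A bottom that wanders between the ridges gives a large spread.
wander_mean, wander_cv = ratio_consistency([30.0, 70.0], [70.0, 30.0])
```

A small coefficient of variation suggests the thinned bottom tracks the ridges; a large one suggests the bottom is jumping between the inner and outer side.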

Currently I'm running tests on groove shapes with raw (not resampled) undulations and with resampled/fitted undulations. The tests are not over yet, but the preliminary results are not so good. The thinned bottoms apparently contain a lot of noise and fail to meet the above requirements. I'll finish the tests for all the thinning strategies listed above and give a thorough report.

Sidewall slopes can provide stereo info. However, since the grooves are U-shaped (also according to Nasce's thesis) and sidewalls are missing with WLI, it is also hard to estimate the sidewall slopes with the tiny portion of the bottom, although I haven't tried.

 


Saturday, July 24, 2010

New outline for dissertation Chapter 1 and 2

Summary

Previously I felt it too awkward to avoid talking about surface metrology or phonograph recording characteristics in the narrative of the "related works" section in Chapter 1. After reconsidering the flow of narration, I think it would be better to keep Chapter 1 very brief, so that it only covers the research objectives and the chapter outline of the rest of the dissertation, and to reserve the true meat of the related works for Chapter 2 as a literature review. It would be necessary, of course, to mention that other OAR approaches exist and the general pros of the OAR family. We would then not go any further, leaving out the cons and any more detail on the common characteristics of OAR that we have now.

The new outline of Chapter 1 and 2.

Chapter 1

  1. General background: One sentence definition of phonograph, the trend of digitization, and the rise of OAR, and its advantages (largely the same as the first three paragraphs that we have now)
  2. Research objectives: (using general terms without touching technical detail)
    1. Focus on stereo disc phonograph recordings.
    2. Study a specific non-contact surface metrology approach: white-light interferometry (WLI), which acquires the 3D surface information of disc recordings and saves it as images.
    3. Implement an image processing chain to extract audio information from the surface images.
    4. Evaluate the resulting audio quality.
  3. Research contributions:
    • First to focus on stereo recordings with WLI.
  4. Outline of the rest of the chapters

Chapter 2

  1. Phonograph recording technology (Focus on stereo disc recordings)
  2. Surface metrology (Focus on non-contact methods: ray-tracing, optical microscopy, confocal microscopy, WLI)
  3. Existing OAR approaches
    1. with ray-tracing (history of laser turntable that's already written)
    2. with 2D imaging (Stotzer)
    3. with confocal (Haber and McBride)
    4. with optical microscopy (Tian)

 


Saturday, July 17, 2010

Center correction for the music sample

Summary
The same center correction process reported last time is performed on the music sample. Although the wavelet differentiation still introduces DC offsets to the signal, the pitch fluctuation is largely removed.

Before center correction:
http://kakyoism.webhop.net/~kakyo/0.report/47.center/audio_mitac_stereo_v2_algo23_RIAA.wav

After center correction:
http://kakyoism.webhop.net/~kakyo/0.report/47.center/audio_mitac_stereo_v2_setup28-cca1-r3484-a0.12pi.wav

Plan
Try to fully automate the process in a coarse-to-fine way, using both magnitude and phase information of the resulting audio from each Monte Carlo center shift
  1. Coarse correction: Use the audio resulting from the non-corrected center info as a reference. If we sample the new center angularly with a fixed shifting radius, then depending on where the Monte-Carlo test center is, some of the samples will be out-of-phase with the reference signal while others will be in-phase. When the corrected audio results from the various angular positions are uniformly out-of-phase or uniformly in-phase, we know that the corresponding Monte-Carlo center is far from the true center. Then, by sampling in the radial dimension, we search for a critical point where angular sampling gives a mix of out-of-phase and in-phase results. The sampling resolution in this step can be very low, i.e., just a few samples will get us to the critical Monte-Carlo center.
  2. Fine correction: We then take Monte-Carlo samples around the critical center with a higher resolution in the angular and radial dimensions, and minimize the low-end spectral energy in the magnitude response of the audio results.
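One hedged way to picture the coarse step: label each candidate's audio as in-phase or out-of-phase by the sign of its zero-lag correlation with the reference, then flag a radius as critical when the labels are mixed. The correlation test and the function names are assumptions for illustration; how phase would actually be compared is not decided yet.

```python
import math

def phase_label(reference, candidate):
    """Label a candidate by the sign of its zero-lag correlation with the reference."""
    dot = sum(r * c for r, c in zip(reference, candidate))
    return "in" if dot >= 0 else "out"

def is_critical(labels):
    """A shift radius is 'critical' when its angular samples give mixed labels."""
    return len(set(labels)) > 1

# Toy check with a one-cycle sine as the reference signal.
n = 64
reference = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
same = list(reference)
flipped = [-v for v in reference]
labels_uniform = [phase_label(reference, same), phase_label(reference, same)]
labels_mixed = [phase_label(reference, same), phase_label(reference, flipped)]
```

In practice the comparison would probably be restricted to the low-frequency wow component rather than the full-band audio.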


Tuesday, July 13, 2010

Disc center correction to remove wow pitch fluctuation

Summary
A simple iterative center correction procedure is used:
  1. Assume that the current disc center is off from the true center. Randomly choose N center shifts in polar coordinates: select R radii and, for each radius, evenly sample the angle range from 0 to 2*pi. E.g., radius = 1000 : 3000 (pixels), angle = 0 : pi/8 : 2*pi.
  2. Apply the N shifts individually to the polar coordinates of the extracted groove undulation and redo the fitting, downsampling, and differentiation, finally outputting the audio.
  3. Examine the low-end energies (0~50Hz) of the N audio results. Find a saddle point in the sorted N energies. The shift radii whose resulting energies are adjacent to the saddle point are chosen as the fine range for the next iteration.
  4. Keep refining the shift radius range until a satisfying minimum low-end-energy is reached.
  5. If no saddle point is found, then repeat #1~4 until a limit on the iteration count is reached.
  6. If a saddle point is found through #1~4, but the radius has already stabilized so that the low-end energy no longer changes over the iterations, then vary the angles to continue fine-tuning the low-end energy.
  7. Repeat #1~6 until the wow pitch fluctuation is no longer detectable.
The result is at
http://kakyoism.webhop.net/~kakyo/0.report/47.center/audio_mitac_stereo_v2_cca3-rc3.684402e+02-3.014367e+03.wav

The corresponding center shift correction:
radius: 3.0368mm (about 4.75 FOVs)
angle: 0.1216*pi
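The candidate generation, the shift itself, and the low-end energy measure from the procedure above can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, the angles cover [0, 2*pi) without repeating the endpoint, the DC bin is skipped, and a naive DFT stands in for whatever spectrum estimate is actually used.

```python
import math

def candidate_shifts(radii, n_angles):
    """All (radius, angle) shift candidates; angles evenly cover [0, 2*pi)."""
    return [(r, 2.0 * math.pi * k / n_angles)
            for r in radii for k in range(n_angles)]

def apply_center_shift(polar_points, shift_r, shift_theta):
    """Re-express (r, theta) points about a center moved by the given polar shift."""
    sx = shift_r * math.cos(shift_theta)
    sy = shift_r * math.sin(shift_theta)
    shifted = []
    for r, theta in polar_points:
        x = r * math.cos(theta) - sx
        y = r * math.sin(theta) - sy
        shifted.append((math.hypot(x, y), math.atan2(y, x) % (2.0 * math.pi)))
    return shifted

def low_band_energy(signal, sample_rate, f_max=50.0):
    """Sum of squared DFT magnitudes for bins above DC up to f_max (naive DFT)."""
    n = len(signal)
    k_max = min(int(f_max * n / sample_rate), n // 2)
    energy = 0.0
    for k in range(1, k_max + 1):
        re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(signal))
        im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(signal))
        energy += re * re + im * im
    return energy

# Three radii times sixteen angles, mirroring the pi/8 angular step above.
shifts = candidate_shifts(radii=[1000, 2000, 3000], n_angles=16)
```

Each candidate would be passed through `apply_center_shift`, the fitting and differentiation would be redone, and `low_band_energy` of the resulting audio would be compared across candidates.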

Plan:
Use a spiral equation to fit the Cartesian coordinates at the beginning of the image processing and see if the wow can be lowered before any center correction attempts.


Tuesday, June 29, 2010

Archimedean spiral!!

Summary

Mathematically, record grooves are cut as an Archimedean spiral.

Quote from Wikipedia
http://en.wikipedia.org/wiki/Archimedean_spiral

"The Archimedean spiral has a plethora of real-world applications. Scroll compressors, made from two interleaved Archimedean spirals of the same size, are used for compressing liquids and gases. The coils of watch balance springs and the grooves of very early gramophone records form Archimedean spirals, making the grooves evenly spaced and maximizing the amount of music that could be fit onto the record (although this was later changed to allow better sound quality)"

--Penndorf, Ron. "Early Development of the LP". http://ronpenndorf.com/journalofrecordedmusic5.html. Retrieved 2005-11-25. See the passage on Variable Groove.

What we can learn from the above:

The equation in polar coordinates:

$$ r = a + b\theta $$
The inter-revolution groove spacing: 2*pi*b.
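The spacing follows directly from the spiral equation by advancing the angle by one full revolution:

```latex
r(\theta) = a + b\theta
\quad\Longrightarrow\quad
r(\theta + 2\pi) - r(\theta)
  = \bigl(a + b(\theta + 2\pi)\bigr) - (a + b\theta)
  = 2\pi b
```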

So theoretically, with this a priori knowledge we can estimate the center of the grooves without physically measuring it, as we did before with the center hole or lead-out grooves.

To correct systematic errors in cutting devices, we may need some gradient-descent in the end but it would be much easier than trying to converge to the "imaginary center" blindly.

Test suites
To test this idea, we can run a test suite:
1. Check the inter-revolution spacing of many grooves for its mean and variance. We can treat the distance between the edges or bottoms as the target distance. If the mean and variance together show a constant inter-groove spacing, we can go on. Otherwise we need to rethink the mathematical abstraction. One issue with this idea is a chicken-and-egg problem: to test the groove spacing we have to rely on the polar coordinates that we already have.

2. Estimate the spiral parameters (a, and b) and eventually the center.

3. Update all groove polar coordinates of our scanned data with the new center and check output signal quality.
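For step 2, a minimal sketch of estimating a and b by ordinary least squares, assuming the groove points are already in polar form about a trial center and that the angles are unwrapped (accumulated across revolutions). The function name is hypothetical. The residual it returns could serve as the cost when searching over trial centers, but that search is not shown.

```python
import math

def fit_spiral(thetas, radii):
    """Least-squares fit of r = a + b*theta; returns (a, b, sum of squared residuals)."""
    n = len(thetas)
    mean_t = sum(thetas) / n
    mean_r = sum(radii) / n
    s_tt = sum((t - mean_t) ** 2 for t in thetas)
    s_tr = sum((t - mean_t) * (r - mean_r) for t, r in zip(thetas, radii))
    b = s_tr / s_tt
    a = mean_r - b * mean_t
    residual = sum((r - (a + b * t)) ** 2 for t, r in zip(thetas, radii))
    return a, b, residual

# Synthetic spiral (hypothetical values): about three revolutions of unwrapped angle.
thetas = [0.1 * k for k in range(200)]
radii = [50.0 + 0.3 * t for t in thetas]
a, b, residual = fit_spiral(thetas, radii)
spacing = 2.0 * math.pi * b  # inter-revolution groove spacing
```

With real data the audio undulation rides on top of the spiral, so the residual will not vanish even at the true center; only its slowly varying part is informative about the center error.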

Optionally, we can do another test suite:
Generate a sinewave based on an Archimedean spiral and see if we can reproduce the low-frequency modulation in the end. Then try to estimate the original center from the wrong signal.

Previously we've confirmed that a sinewave generated around a perfect circle can be modified to show the same modulation by shifting the center.
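The effect of a shifted center can be reproduced with a few lines, assuming an unmodulated perfect circle and a center displaced along the x axis. The radius and offset values below are hypothetical, in pixels.

```python
import math

def radii_from_offset_center(true_radius, offset, n_samples):
    """Radius of a perfect circle as seen from a center displaced by `offset` along x."""
    radii = []
    for k in range(n_samples):
        t = 2.0 * math.pi * k / n_samples
        x = offset + true_radius * math.cos(t)
        y = true_radius * math.sin(t)
        radii.append(math.hypot(x, y))
    return radii

radii = radii_from_offset_center(true_radius=50000.0, offset=3000.0, n_samples=360)
# The measured radius swings between (true_radius + offset) and (true_radius - offset).
peak_to_peak = max(radii) - min(radii)
```

The measured radius goes through one full swing per revolution, which is the once-per-revolution modulation that shows up as wow after differentiation.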




Monday, June 28, 2010

Better smoothing

Summary

Using a piece-wise polynomial fit on the raw groove edge undulation and then a wavelet differentiator gives a better smoothing result than any other combination we tried before, e.g., a moving average on the undulation followed by differentiation.

The local result on the same signal segment shown last week:
http://kakyoism.webhop.net/~kakyo/0.report/46.smooth/wavelet_smooth.png

Audio (Sinewave)
http://kakyoism.webhop.net/~kakyo/0.report/46.smooth/audio_mitac_stereo_v2_algo23_NORIAA.wav

It's not yet clear exactly how the wavelet differentiator smooths out the noisy signal, but it works much better than moving-averaging the raw signal (radii of the groove edges). From the literature I have read, wavelet differentiators have a reputation for better noise-reduction performance.
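For intuition, here is a sketch of one simple member of this family, a Haar-style differentiator: the slope is estimated as the difference between the means of two adjacent windows, so averaging and differencing happen in one step. This is an assumed illustration; the wavelet and scale used for the results above are not stated here.

```python
def haar_derivative(signal, half_width, dt=1.0):
    """Slope estimate from the difference of adjacent window means (Haar-style).

    Output sample j corresponds to the boundary between input samples
    j + half_width - 1 and j + half_width.
    """
    m = half_width
    out = []
    for i in range(m, len(signal) - m + 1):
        left_mean = sum(signal[i - m:i]) / m
        right_mean = sum(signal[i:i + m]) / m
        # The two window centers are m samples apart.
        out.append((right_mean - left_mean) / (m * dt))
    return out

# Sanity check on a ramp with slope 2.0 (dt = 0.5, so samples are 0, 1, 2, ...).
ramp = [2.0 * 0.5 * i for i in range(20)]
slopes = haar_derivative(ramp, half_width=4, dt=0.5)
```

A larger `half_width` averages more noise but also blurs faster undulations, which is the usual trade-off when choosing the scale.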

I also got a rough idea of what delta-sigma modulation does. It uses pulses (a bit-stream) with fixed voltage and pulse width, but varies the pulse spacing so that the pulse density follows the signal level, e.g., denser pulses for a stronger signal. By "oversampling", i.e., using an extremely dense bit-stream, a weak signal gets less quantization noise than with vertical (amplitude) quantization.
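A minimal first-order delta-sigma modulator sketch, assuming inputs in the range -1 to 1 and a 1-bit output of +1 or -1. In this sketch the density of +1 bits tracks the input level, and the average of the bit-stream approaches the input value. The function name is hypothetical.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma: integrate the error between input and previous output bit."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback
        bit = 1 if integrator >= 0 else -1
        bits.append(bit)
        feedback = float(bit)
    return bits

# Constant inputs, heavily oversampled (1000 bits per input value).
bits_pos = delta_sigma_1bit([0.5] * 1000)
bits_neg = delta_sigma_1bit([-0.5] * 1000)
mean_pos = sum(bits_pos) / len(bits_pos)
mean_neg = sum(bits_neg) / len(bits_neg)
```

Averaging (low-pass filtering) the bit-stream recovers the input, which is why a longer bit-stream per input value gives a finer effective resolution.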

Assume that our scanned 1kHz sinewave groove undulation is encoded by delta-sigma modulation and that the noise level is one pixel. Because the peak-to-dip duration contains about 200 points under 10X magnification (1um/pixel resolution), the DNR = 20log10(200/1) ~= 46dB. If the noise level is 0.5 pixel, then DNR ~= 52dB.
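The arithmetic above can be checked with a short helper (the function name is hypothetical):

```python
import math

def dnr_db(signal_points, noise_pixels):
    """Dynamic range in dB for a given signal span and noise level."""
    return 20.0 * math.log10(signal_points / noise_pixels)

dnr_one_pixel = dnr_db(200, 1.0)
dnr_half_pixel = dnr_db(200, 0.5)
print(round(dnr_one_pixel, 1))   # 46.0
print(round(dnr_half_pixel, 1))  # 52.0
```

Halving the noise level adds about 6dB, as expected from 20log10(2).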

The reported analog 78RPM systems have 30 ~ 40dB DNR; LP systems have around 60dB. Stotzer's system provides about 19dB DNR for 78rpm and 16dB for LP.

Plan
Start working on center-correction to remove the low-frequency modulation.

