Scaled reassigned spectrograms applied to linear transducer signals

: This study evaluates the applicability of scaled reassigned spectrograms (ReSTS) on ultrasound radio frequency data obtained with a clinical linear array ultrasound transducer. The ReSTS’s ability to resolve axially closely spaced objects in a phantom is compared to the classical cross-correlation method with respect to the ability to resolve closely spaced objects as individual reﬂectors using ultrasound pulses with different lengths. The results show that the axial resolution achieved with the ReSTS was superior to the cross-correlation method when the reﬂected pulses from two objects overlap. A novel B-mode imaging method, facilitating higher image resolution for distinct reﬂectors, is proposed. V C 2021 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/) .


Introduction
To achieve higher axial resolution B-mode images in medical ultrasound, a common goal is to make the ultrasound pulses as short in time as possible.It is well known that the axial resolution of the machine is largely governed by the spatial length of the pulse, where the axial resolution in an ultrasound image is commonly defined as half the spatial full width at half maximum (FWHM) pulse length (Ng and Swanevelder, 2011).Reflective objects with less than this spacing in the axial direction will cast individual reflections that overlap so severely in time that they will appear as a single reflector in the B-mode image.
The image resolution of ultrasound machines can be improved by several signal and image processing techniques (Demirli and Saniie, 2001;Merdjana et al., 2018;€ Ozkul et al., 2019).However, just a few techniques focus on the radio frequency (RF) signals that in fact contain more acoustic information, i.e., both frequency and phase, besides temporal information.Several methods aim to increase the spatial resolution of a point scatterer using deconvolution (Yeoh and Zhang, 2006;Chira, 2011;Jin et al., 2016;Kageyama et al., 2013), where the signal is modelled as a convolution between the transmitted pulse and the acoustic reflectivity function.The usual deconvolution is performed in the time-spatial domain and the entire frequency range of the noise is thereby affecting the estimate.The frequency-based techniques, typically based on the Wiener filter are affected by the disturbance in the close range of the transducer frequency.However, for an accurate estimate the method also requires an estimate of the noise variance which might introduce errors in the resulting Wiener filter.Furthermore, the Wiener filter is a technique assuming stationary properties of the signal.Superresolution imaging is a term that is becoming increasingly common in this field of research.It refers to methods that combine multiple, slightly different, low-resolution images to create one high-resolution image.Motion between different low-resolution images, either in regular ultrasound imaging (Wu et al., 2014) or after injecting microbubbles (Errico et al., 2015;Yu et al., 2018;Chen et al., 2021), are often utilized to achieve this.However, due to the large number of imaging frames, super-resolution imaging is generally limited by low temporal resolution and often requires motion correction of the assumed stationary background.
The scaled reassigned spectrogram and its recent developments (Hansson-Sandsten and Brynolfsson, 2015;Brynolfsson and Sandsten, 2018;Brynolfsson et al., 2021) is a time-frequency reassignment technique (Kodera et al., 1976;Auger and Flandrin, 1995;Flandrin, 2013), designed to estimate the center time and center frequency of an oscillating pulse.This allows for visual separation of overlapping pulses.A method for automatic detection of individual Gaussian shaped and oscillating pulses in a noisy environment was presented in Reinhold et al. (2018).The method is referred to as reassigned spectrogram for transient signals (ReSTS).This has so far only been tested on single element source and receiver systems, in dolphin echolocation research (Starkhammar et al., 2019) and complementary strictly simulated/synthetic signals (Reinhold et al., 2018).In this study, we evaluate the potential of using the ReSTS also on RF signals generated and obtained with a linear array ultrasound transducer, constructed out of 192 source and receiver elements.The ReSTS's ability to detect closely spaced reflectors is compared with the cross-correlation method.

General measurement set-up
An ultrasound advanced open platform [ULA-OP (Tortoli et al., 2009)] equipped with a 4-13 MHz linear array transducer (LA523, Esaote, Florence, Italy) with 192 elements was used to examine a tissue-mimicking phantom (model 040GSE, CIRS, Norfolk, VA).The ultrasound transducer was held in place by a clamp stand to ensure that the angle of the probe, and the contact area between the probe and the surface of the phantom was constant during all measurements.Ultrasound gel was used to ensure a good acoustic coupling between transducer and phantom.
Figure 1 shows an illustration of the examined structures inside the ultrasound phantom.White dots represent the locations of reflective nylon wires inside the phantom.The diameter of the wires is approximately 80 lm.The two areas of interest and specifically used in the analysis are highlighted by dashed, red circles numbered 1 and 2.
The ULA-OP scanner was set to generate approximately four period long pulses with variable frequency content.The aperture used was 64 elements (pitch 245 lm) and the focus depth was set on the target at 3 cm depth.The variation of the frequency content made it possible to obtain different pulse lengths, where the longest (worst axial resolution) was chosen so that detection of the smallest reflector spacing would be impossible using only regular cross correlation.The shortest pulse length (best axial resolution) was chosen such that visual detection of both reflectors would be trivial.The range of pulse lengths was intended to encompass the transition from correct detection to detection failure due to limited axial resolution for both tested methods.The range of pulse lengths was varied by varying the transmit frequencies f in in the ULA-OP from 4.0 to 12.0 MHz in 0.5 MHz steps.Note that since these frequencies span most of the transducer bandwidth, the true frequencies of the transmitted pulses were expected to differ due to physical limitations of the transducer.However, the exact frequency content was not in itself of interest in this study and the range of the generated pulse lengths was found adequate as a testing variable.The acquired data from the tissue-mimicking phantom was stored as RF data and imported to MATLAB R2020a (MathWorks, Natick, MA, USA) for post-processing.
The phantom area, used for investigating the ReSTS's ability to resolve echoes from closely spaced reflectors, is shown in Fig. 1 as a red circle marked with 1 (area 1).Approximated pulse lengths of the transmitted signals were obtained by measuring the FWHM of the reflected signals from a separate reflector in the phantom, the red circle marked with 2 (area 2) in Fig. 1.The resulting peak frequencies f pk and the 3 dB bandwidths of the pulses were also measured in area 2 for all transmitted signals.
Although the positions of the closely spaced reflectors in area 1 were not axially aligned, the widths/diameters of the reflectors were such that acoustical components from both reflectors in area 1 would overlap for at least one axial data line.The true distances were measured with a high-resolution ultrasound system (Visualsonics 2100, FUJIFILM VisualSonics, Inc., Toronto, ON, Canada) equipped with a 20 MHz center frequency transducer.Five measurements were taken, and their mean and median values calculated.The measurements were taken manually on the machine by zooming in on area 1 in the phantom and by using the inbuilt measure functionality where a distance between selected pixels was calculated automatically by the machine.
As examples of the different evaluated frequencies and pulse lengths, the B-mode image of the measured ultrasound signal is shown in Fig. 2, where Fig. 2(a) originates from RF data of a pulse with relatively high frequency (HF) content (f in ¼ 9.5 MHz as input parameter to ULA-OP) and thus shorter FWHM pulse length.Fig. 2(c) shows the same information but from data obtained with a low frequency (LF) pulse (f in ¼ 5.0 MHz) with longer FWHM pulse length.Figures 2(b) and 2(d) show the magnifications of the corresponding area 1 in Fig. 1, where the solid vertical white line in both images depict the studied axial data line for all tested pulses.

Analysis
All measured RF signals were filtered with a 4th order Butterworth bandpass filter with upper and lower cut-off frequencies of 20 MHz and 100 kHz respectively, and then further analyzed using the ReSTS as developed by Reinhold et al.   1 (5), 052001 (2021)  1, 052001-2 (2018).This method was chosen since it was developed for the purpose of detecting and separating overlaying oscillating pulses, and because it has been shown to perform well in applied settings (Starkhammar et al., 2019).
The ReSTS method builds on the reassignment technique for short signals.This method requires calculation of the spectrogram of the recorded signal y(t) using the Gaussian analysis window and calculation of the reassignment coordinates where F th r ðt; f Þ is calculated in the same way as F h r ðt; f Þ, but with window function t Á h r ðtÞ.The reassignment coordinates are used to change the energy distribution of the spectrogram by mapping the energy from location (t, f) to ð tr ðt; f Þ; f r ðt; f ÞÞ.The new energy distribution is called the reassigned spectrogram, RS r ðt; f Þ.This reassignment technique assumes that the envelopes of the pulses in y(t) are Gaussian functions, equal, except in amplitude, to the window function h r ðtÞ.
In theory, with no disturbances or too close pulses in the signal, and the window exactly matching the length and shape of the pulse envelopes, the RS r ðt; f Þ will give perfectly localized peaks at the time and frequency centers of individual pulses.Thus, in ideal conditions, the reassigned spectrogram is a time-frequency image of the arrival times and frequencies of the pulses.However, for recordings, when there are disturbances, the pulses overlap, and the pulse envelopes are not exactly known, the reassigned spectrogram image is less clear, and the ReSTS method is needed.The ReSTS calculates the reassigned spectrogram and automatically analyzes the image to find estimates of the arrival times and frequencies of the recorded pulses.The automatic detection is done by using a predefined resolution limit in time and frequency.The result of the ReSTS method is an estimate of the ideal time-frequency image, showing peaks only at the arrival times (t n ) and frequencies (f n ) of the individual pulses.
The resolution limits for Gaussian functions are in theory (no disturbances) 2r in time and 1=ðrpÞ in frequency, where r is the scaling parameter as seen in Eq. ( 2) (Holzmann and Vollmer, 2008).It has been shown that for reasonable amounts of disturbances and in single element source and receiver systems, the ReSTS reaches those resolution limits (Reinhold et al., 2018).In this study the axial resolution offered by the ReSTS for multi-element transducers was tested by  applying the method to the studied axial data line of area 1 in the phantom, see Fig. 1, i.e., where the two reflectors in that area make the acoustical components overlap.
For reliable estimates of the arrival times and frequencies, the ReSTS requires an estimate of the pulse length.This means finding a suitable estimate for the scaling parameter r in Eq. ( 2), where an increase in r corresponds to a longer pulse in time.As described in the general measurement set-up, the pulse length is estimated by measuring the FWHM.For Gaussian functions an estimate of r, can be obtained from that measured pulse length.Another possibility is to evaluate the time-frequency localization of the ReSTS for different candidate r.The localization is measured by calculating the local R enyi entropy around a single peak where b ¼ 3 as per discussion in Baraniuk et al. (2001).The measure is calculated in the local time-frequency area (t n 6Dt; f n 6Df ), where (t n , f n ) represents the nth pulse's estimated arrival time and frequency (Brynolfsson and Sandsten, 2018;Hansson-Sandsten and Brynolfsson, 2015).With the normalization of the signal energy in Eq. ( 5), the R enyi entropy measures for different r is comparable.The optimal estimate of r is given from the best time-frequency localization, identified by the lowest R enyi entropy.
A number of preliminary candidate values of r were used as input to the ReSTS, all related to the initial estimate r FWHM , where X ¼ ð0:50; 0:55; …; 1:50Þ.The R enyi entropies in area 1 were calculated as the mean of the two local R enyi entropy values obtained from the two time-frequency peaks (t 1 , f 1 ) and (t 2 , f 2 ), with Dt ¼ 0:2 ls and Df ¼ 0:49 MHz for all estimations of r i .This procedure was repeated for a range of transmit frequencies f in in order to find the optimal r i for all 17 cases of the output pulse lengths.As described in the general measurement set-up, the output pulse length was indirectly varied by varying the set transmit frequency parameter in ULA-OP.Initial analysis of the data clearly showed that the r i for X ¼ ð1:05; 1:10; …; 1:50Þ (r i > r FWHM ) was not useful for detecting severely overlapping transient signals.This matches well with theory as such a long analysis window result in a higher resolution in frequency at the expense of poorer resolution in time.Since we here are interested in separating pulses in time, and not in frequency, the use of shorter analysis windows, than given by the initial guess r FWHM , represents the interesting cases for further analysis.Therefore, the tested combinations of transmit frequency f in and final candidate r i values, for X ¼ ð0:50; 0:55; …; 1Þ, are listed in Table 1.
In order to compare the require computational time cost for the ReSTS method with the cross-correlation method the time required for calculating one line and detect one reflector was measured for the two methods.The computational times were measured 100 times using the inbuilt "tic-toc" function in MATLAB for two different lengths of sample lines, 102 samples (centred around reflector) and 3200 samples (full line length), respectively, obtained from the line crossing the reflector in area 2. To make the comparison as fair to both the methods as possible it was important to make sure that the reference signals for both methods was equally long.Therefore, a Hermite function of 512 samples was used in the ReSTS and a 512 long reference signal was used as input to the cross-correlation function.As an additional measure of the computational times the built in "timeit" function in MATLAB was also used for the two methods, and their output compared.All tests were run on a standard laptop Intel(R) Core(TM) i7-9850H, 2.59 GHz processor.
A novel B-mode image was generated by summarizing each line in the scaled reassigned time-frequency representation for all frequencies and taking the envelope of the resulting time signal corresponding to the time-frequency marginal.We show the results from this using the candidate r 4 value.However, the same operation can be made for any candidate sigma.

Results
Table 2 shows the mean R enyi entropies obtained from area 1 for each transmit frequency f in and the different input r i from Table 1.The transmit frequency f in is the input setting parameter to ULA-OP in MHz, and for each value of f in the minimum mean R enyi entropy is marked in bold.In Table 3 the measured reflector distances (ds ¼ jt 1 À t 2 j) obtained from area 1 in the phantom are presented for different input center frequencies f in together with the measured peak frequency f pk and bandwidth bw in MHz, of the signal in area 2. The gray cells are results from r estimates with the minimum R enyi entropies.For comparison, the RF signal from the studied axial data line in area 1 was cross correlated with the RF signal obtained from area 2 in Fig. 1.The resulting time indexes of the detected cross-correlation peaks are also listed in Table 3, denoted as x corr .
The mean and median value of all the distances generated from the optimal r (bold numbers) from f in ¼ 4.5 to 12 MHz, was 358 and 362 lm, respectively.The mean and median value of the distance as detected with cross correlation from f in ¼ 6.5 to 12 MHz was 375 and 377 lm, respectively.The true mean and median distance of the closely spaced reflectors in area 1 was measured to 329 and 338 lm, respectively, with the high-resolution ultrasound research system (Visualsonics 2100).The mean values of ds obtained by ReSTS differed from the true distance by 8.8%.The mean values of ds measured with cross correlation differed from the true distance by 14.0%.
The axial resolution test is also exemplified in Fig. 4 where the time detections of each reflection originating from the two reflectors in area 1 in Fig. 1 is marked with a dashed red line.Gray solid lines show the RF pulse of the reflected signal of the LF and HF pulse.Black solid lines show the envelopes of the RF pulses of the reflected signal using the LF and HF pulse.2. Calculated mean of the local R enyi entropy values of the two estimated time-frequency peak values for each case of input r i to ReSTS of the signal in area 1.For each value of f in (MHz), the found minimum mean R enyi entropy is marked using a gray background cell color and the values are typed in bold face.Cases where only one time-frequency peak is found is marked with -and cases where more than two peaks are estimated are marked with Â.None of those cases are of interest for further analysis.The mean values of the times required for calculating one line and detect the reflector using ReSTS with the "tic-toc" function in MATLAB were measured to 22 ms with a standard deviation of 5.8 ms for the 102 sample long data snippet.The minimum calculation time was 19.7 ms and the maximum 69.0 ms.Corresponding values for the 3200 samples long data snippet were 619 ms and 15 ms, using ReSTS.Here, the minimum calculation time was 600 ms and the maximum 714 ms.The mean values of the times required for calculating the 102 samples long data snippet and detect the reflector using the cross-correlation method were measured to 19 ms with a standard deviation of 4.7 ms.The minimum time was measured to 0.95 ms and the maximum was 40 ms.The corresponding values for the full length line was 2.5 ms with a standard deviation of 4.7 ms.The minimum time was measured to 1.3 ms and the maximum 40 ms.The "timeit" time measurements of the 102 sample signal was 21.4 ms for the ReSTS and 1.4 ms for the cross-correlation method.Corresponding values for the 3200 sample signal was 630 ms and 2.0 ms, respectively.
The novel B-mode image, based on the scaled reassigned time-frequency representation, is presented in   (MHz).The resulting measured peak frequencies and bandwidths, f pk and bw, both in MHz, from the signal in area 2, are also presented for each value of f in .Cells corresponding to the measured minimum R enyi entropies are gray (same locations as in Table 2).For comparison, the time indexes using cross correlation are presented as x corr .Cases where only one time-frequency peak is found are marked by 0 and cases where more than two peaks are estimated are marked with Â.None of these cases are of interest for further analysis.

Discussion
The measured distances using both the traditional cross-correlation method and the ReSTS correspond well the correct value of the distance between the reflectors in area 1 in Fig. 1.However, the ReSTS arrived at a distance closer to the true distance (ReSTS:8.8%off vs cross correlation:14.0%off).The ReSTS proved to be able to detect the individual arrival times of the echoes from the reflectors for longer pulses compared to the cross-correlation method.Hence, the ReSTS can be used to improve the axial resolution of the ultrasound image.This is also verified by Fig. 3 where (b) and (d) obtained with the ReSTS shows higher axial resolution for both the HF and LF case than (a) and (c) using traditional B-mode imaging.
For short pulses, high frequencies f in , it was noted that the actual use of r FWHM , as input to ReSTS, gave a distance estimate ds, which was actually closer to the true distance compared to using the optimal r i corresponding to the lowest local R enyi entropy.However, for long pulses, low frequencies f in , the use of the optimal r i % 0:5r FWHM , as input to ReSTS, managed identification of two peaks and a corresponding estimated distance between these, close to the true distance.This was not the case with use of input r FWHM , nor with use of the cross-correlation method.
Figure 4 shows that the proposed method can detect and correctly estimate the arrival times of the two overlapping pulses, originating from the two reflectors in area 1.This offers a new and objective method to further improve the axial resolution of ultrasound machines.The computational cost of the ReSTS method is about 11 times higher than that of the cross-correlation method for the tested shorter data snippets and 250 for the full length data line.In situations where it is important to achieve a higher resolution, e.g., in thickness estimations of thin membranes, it is a reasonable cost.We anticipate that a normal procedure to use the method for thickness measurements would be to apply the ReSTS on small subsets of data where the user manually picks the region of interest in a subsection of the larger dataset.In such cases the 11 times longer computational effort required is still less than a second.However, the suggested B-mode image using the ReSTS may require more work to be practical in runtime.However, this method has not yet been optimized for speed, as the cross-correlation function in MATLAB has been.Although speculative, a low level implementation of the ReSTS as a built in function in MATLAB or any other software package may even out the discrepancy.
The results suggest that the ReSTS is robust to frequency differences of the transducer.This might especially be relevant in diagnostics of deeply lying organs and when thin layers of tissue or closely separated details are studied but a lower frequency transducer has to be used in order to acoustically reach the deeper lying tissues.In addition, ultrasound in veterinary medicine of large mammals like horses and livestock may also benefit from the method as lower frequency transducers are often used due to the higher acoustic attenuation properties of their tissues and sheer size.The results presented here strongly motivate future studies of the applicability of ReSTS on RF ultrasound data obtained from ex vivo tissue samples where results can be compared with observations obtained through dissection.

Conclusion
The ReSTS showed better performance than the cross-correlation method in terms of the ability to find the separate arrival times of heavily overlaying echoes from closely spaced reflectors in an ultrasound phantom.This allows for better axial 4. Detections of reflectors of (a) the LF signal and (b) the HF signal.Amplitudes are normalized to each signal.The detected time of each reflector is marked with a dashed line in red and their sample numbers are written out on the order of their relative amplitude in the ReSTS as t D1 and t D2 .resolution and/or longer pulse lengths of the transducer.The possibility to use longer pulses also means more acoustic energy per pulse in the lower frequency regions and thus potentially deeper tissue penetration as higher frequencies are more attenuated by the tissue than lower frequencies.These promising results motivate future studies of the ReSTS applicability on biological tissue samples.

Fig. 1 .
Fig. 1.Ultrasound phantom structure.Dashed, red circles mark areas where individual RF data lines were analyzed.
Fig. 2. (a) B-mode image of phantom obtained with a HF pulse (f in ¼ 9.5 MHz) and the data line chosen for axial resolution analysis magnified in (b).(c) B-mode image of phantom obtained with a LF pulse (f in ¼ 5.0 MHz) and the data line chosen for axial resolution analysis magnified in (d).

Fig. 3 .
Fig. 3. (a) and (c) traditional B-mode image of phantom obtained with HF (top) and LF (bottom) signal, (b) and (d) new reassigned B-mode image of HF (top) and LF (bottom) signal.The linear grayscales have been individually adjusted to enhance visual clarity for all four subplots (a)-(d).

Table 1 .
Candidate input parameters r i to ReSTS for all different transmit frequencies f in presented in MHz.

Table 3 .
Estimated distances ds i (lm) of area 1 based on r i as input parameter to ReSTS and different f in