Deck 7: Multimedia Network Communications and Applications, Wireless Networks and Content-Based Retrieval in Digital Libraries

Question
List three QoS parameters for multimedia transmission and explain for some specific applications how the values of these parameters are affected by the application data.
Question
How does ATM support multimedia transmission, in particular QoS requests?
Question
Suppose you have a dedicated channel with fixed bandwidth, and you would like to provide channel surfing capabilities which restrict you from using more than one second delay in video decoding.
Question
Among FDMA, TDMA or CDMA, which one provides the most efficient use of the allocated spectrum for multiple access? Justify your choice.
Question
What is texture? Explain in detail what is meant by texture descriptors, for use in content-based image and video search.
Question
Describe the most important features for video search based on content descriptors. If you had to find a single frame to describe an entire video, what features would you use to drive the search?
Question
What constitutes "interactivity" in a multimedia project?
Please discuss briefly the levels of interactivity possible, from least interactive to most interactive.
Question
Write down an algorithm (pseudocode) for defining and calculating a colour histogram for RGB data.
Question
The "hue" is the colour, independent of brightness and how much pure white has been added to it. We can make a simple definition of hue as the set of ratios R:G:B. Suppose a colour (i.e., an RGB) is divided by 2.0, so that the RGB triple now has values 0.5 times its former values. Explain using numerical values:
(a) If gamma-correction is applied after the division by 2.0 and before the colour is stored, does the darker RGB have the same hue as the original in the sense of having the same ratios R:G:B of light emanating from the CRT display device? (we're not discussing any psychophysical effects that change our perception - here we're just worried about the machine itself).
(b) If gamma-correction is not applied, does the second RGB have the same hue as the first RGB, when displayed?
(c) For what colour triples is the hue always unchanged?
Question
Suppose we view a decompressed $512 \times 512$ JPEG image, but use only the colour part of the stored image information, not the luminance part, to decompress. What does the $512 \times 512$ colour image look like? Assume JPEG is compressed using a 4:2:0 scheme.
Question
Suppose an alphabet consists of 6 symbols, and the probability for each of the symbols is $1/6$. (Note, $\log_{2}(3) = 1.585$)
(a) What is the entropy for this set?
(b) Draw the Shannon-Fano tree for this set. What is the average bitrate?
(c) Draw the Huffman tree for this set. What is the average bitrate?
(d) How many bits would we need without compression, assuming fixed-length codewords? What is the compression ratio, compared to the Huffman tree?
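The quantities asked for in parts (a), (c) and (d) can be checked numerically. The following is a minimal sketch, not a model answer: the helper name `huffman_average_length` is hypothetical, and it relies on the standard fact that the expected Huffman codeword length equals the sum of the weights of all merged (internal) nodes.

```python
import heapq
from fractions import Fraction
from math import log2


def huffman_average_length(probabilities):
    """Average Huffman codeword length in bits per symbol.

    Repeatedly merges the two least probable nodes; the expected
    length is the sum of the weights of all merged nodes.
    """
    heap = list(probabilities)
    heapq.heapify(heap)
    total = Fraction(0)
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


probs = [Fraction(1, 6)] * 6                      # six equiprobable symbols
entropy = -sum(float(p) * log2(float(p)) for p in probs)
avg_bits = huffman_average_length(probs)          # exact rational value
fixed_bits = 3                                    # shortest fixed-length code for 6 symbols
ratio = Fraction(fixed_bits) / avg_bits           # fixed-length vs. Huffman

print(f"entropy = {entropy:.3f} bits/symbol")
print(f"Huffman average = {float(avg_bits):.3f} bits/symbol")
print(f"compression ratio = {float(ratio):.3f}")
```

Exact fractions are used so that the average length and ratio come out without rounding error.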
Question
Consider an alphabet with two symbols $A, B$, with probability $P(A) = x$ and $P(B) = 1 - x$. Plot the entropy as a function of $x$.
Note: you might want to use $\log_{2}(3) = 1.6$, $\log_{2}(7) = 2.8$.
Question
Thinking about my large collection of .jpg images, I decide to unify them and make them more accessible by simply combining them into a big .mpg file by simply treating them as frames in a video: my reasoning is that I can simply use a viewer to step through the file, thus making a cohesive whole out of my collection. Comment on the utility of this idea, in terms of the compression ratio achievable for the set of images.
Question
(a) Please define "motion estimation".
(b) Please define "motion compensation".
Question
Suppose an $8 \times 8$ image block happens to have the following entries:
$$\begin{array}{rrrrrrrr}
183 & 160 & 94 & 0 & 0 & 0 & 0 & 0 \\
183 & 153 & 0 & 0 & 0 & 0 & 0 & 0 \\
179 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}$$
(Note that this is a greylevel, 8-bit image, not DCT output).
Now suppose we decide to encode this image into the frequency domain as follows:
• First we go down each column, and carry out a 1-dimensional DCT transform, replacing each column by its set of DCT coefficients.
• However, for the first column we use only a length-3 DCT (i.e., $N = 3$); for the second column we use a length-2 DCT, and for the third column we use a length-1 DCT, always leaving zeros in the transform domain just where they appeared in the original, image domain.
• We leave the DC coefficient always at the top of each column processed.
• Then we use the output from the above stage and go on to do the same procedure for rows 1 to 3.
Question:
(a) Which takes more calculations, the above procedure, or the ordinary 2-D DCT transform? Explain.
(b) Broadly, what is the difference, if any, in the output DCT Image between the new transform and the standard one, for this particular image?
Note: One need not do any calculations for this question but, for reference, recall that the 2-D DCT for an $M \times N$ block size is defined as
$$F(u, v) = \frac{2 C(u) C(v)}{\sqrt{M N}} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \cos \frac{(2 i+1) \cdot u \pi}{2 M} \cos \frac{(2 j+1) \cdot v \pi}{2 N} f(i, j),$$
where $i, u \in [0, M-1]$, $j, v \in [0, N-1]$, and the constants $C(u)$ and $C(v)$ are determined by
$$C(\xi) = \begin{cases} \frac{\sqrt{2}}{2} & \text{if } \xi = 0 \\ 1 & \text{otherwise.} \end{cases}$$
The 1-D DCT is given by
$$F(u) = \frac{2 C(u)}{\sqrt{N}} \sum_{i=0}^{N-1} \cos \frac{(2 i+1) u \pi}{2 N} f(i)$$
Question
An original $8 \times 8$ color "checkerboard" CMY image is shown below in which the two colors are C1: (C = 255, M = 155, Y = 255) and C2: (C = M = Y = 100), where $[0..255]$ is the range for the three color components. You are asked to convert the color CMY image to YIQ images using 4:1:1 chroma subsampling. (In subsampling, you should use an averaging method so you are not selectively throwing away information from certain pixels.)
(a) Show all pixel values of each of the YIQ images generated from the given CMY color image.
(b) Besides their low resolution, do the chrominance images maintain enough information in this case? What does this tell?
Note: The relationship between RGB and YIQ is approximately:
$$\left[\begin{array}{l} Y \\ I \\ Q \end{array}\right] = \left[\begin{array}{rrr} 0.3 & 0.6 & 0.1 \\ 0.6 & -0.3 & -0.3 \\ 0.2 & -0.5 & 0.3 \end{array}\right] \left[\begin{array}{l} R \\ G \\ B \end{array}\right]$$
Question
Suppose we have a small 8-bit grayscale image, with all pixels equal to the same pixel value, say 113. Consider the performance of an LZW compression scheme. First initialize codes in the dictionary with pixel values, $0..255$. Use 9-bit codes.
For a $4 \times 4$ uniform image made of pixel values which are all 113, how many bits will LZW (i.e., PKZIP, WINZIP, etc.) use for a compressed version of the image? Explain in detail, using an LZW table. What is the compression ratio?
Hint: recall that the LZW coding algorithm is
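The algorithm listing that the hint refers to is not reproduced on this card. As a stand-in, here is a minimal sketch of the textbook LZW encoder, assuming a dictionary pre-loaded with the single pixel values 0..255 and new entries numbered from 256; the function name `lzw_encode` is hypothetical.

```python
def lzw_encode(pixels):
    """Textbook LZW encoder over a sequence of 8-bit pixel values."""
    # Dictionary initialised with all single-pixel strings, codes 0..255.
    dictionary = {(v,): v for v in range(256)}
    next_code = 256
    codes = []
    s = (pixels[0],)
    for c in pixels[1:]:
        candidate = s + (c,)
        if candidate in dictionary:
            s = candidate                      # keep extending the current string
        else:
            codes.append(dictionary[s])        # emit code for the longest match
            dictionary[candidate] = next_code  # learn the new string
            next_code += 1
            s = (c,)
    codes.append(dictionary[s])                # flush the final string
    return codes


image = [113] * 16                             # 4 x 4 uniform image, row-major
codes = lzw_encode(image)
compressed_bits = 9 * len(codes)               # 9-bit codes, as stated in the question
original_bits = 8 * len(image)

print(codes)
print(compressed_bits, original_bits, original_bits / compressed_bits)
```

Printing the dictionary after each step gives the LZW table the question asks for.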
Question
Consider a block (8x8 pixels) of an image as shown below. In a particular color plane, the pixel values are as follows:
A standard 2-D DCT for an $8 \times 8$ block size is defined as
$$F(u, v) = \frac{C(u) C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos \frac{(2 i+1) \cdot u \pi}{16} \cos \frac{(2 j+1) \cdot v \pi}{16} f(i, j),$$
where $i, j, u, v$ are in $0..7$, and the constants $C(u)$ and $C(v)$ are determined by
$$C(\xi) = \begin{cases} \frac{\sqrt{2}}{2} & \text{if } \xi = 0 \\ 1 & \text{otherwise.} \end{cases}$$
Suppose we compute a DCT $F(u, v)$, where $u$ is rows and $v$ is columns.
(a) What value does $F(0,0)$ have? Explain.
(b) Describe the contents (roughly) of the other components. Explain.
Hint: Just thinking about it, rather than calculating everything, will save you time. What are the values $F(u, 0)$? What are the values $F(0, v)$?
What are the other values $F(u, v)$?
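The $8 \times 8$ definition above can be evaluated directly to build intuition. This is a sketch only: the block's actual pixel values are not reproduced on this card, so the uniform block `flat` below is a hypothetical stand-in, and `dct_8x8` is an illustrative name for a direct (slow) evaluation of the formula.

```python
import math


def dct_8x8(block):
    """Direct evaluation of the 8x8 2-D DCT definition (no fast algorithm)."""
    def c(xi):
        return math.sqrt(2) / 2 if xi == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for i in range(8):
                for j in range(8):
                    total += (math.cos((2 * i + 1) * u * math.pi / 16)
                              * math.cos((2 * j + 1) * v * math.pi / 16)
                              * block[i][j])
            out[u][v] = c(u) * c(v) / 4 * total
    return out


# Hypothetical test input: every pixel has the same value.
flat = [[100] * 8 for _ in range(8)]
F = dct_8x8(flat)

print(round(F[0][0], 6))   # DC term: 8 times the pixel value for a uniform block
print(max(abs(F[u][v]) for u in range(8) for v in range(8) if (u, v) != (0, 0)))
```

For a uniform block all the energy lands in $F(0,0)$; a block that varies only along rows or only along columns would likewise populate only one edge of the coefficient array.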
Question
(a) Why do we use CMY color primaries for printing, instead of RGB ones? Hint: paper is white, not black.
(b) What colour is Yellow and Cyan, printed together? Why?
Question
(a) Draw a curve showing the relationship of the CIELAB brightness axis to the luminance Y.
(b) What curve studied elsewhere in this course does this resemble? Why is that the case?
Question
In Adaptive Huffman coding using a special NYT code and a 5-bit set of initial codes for an input source consisting of 26 characters, which takes more CPU time, encoding or decoding? Briefly explain.
Question
Another name for zig-zag coding is "zonal coding". Suppose you invent a new zonal coding scheme for JPEG that simply discards anti-diagonals above the first few - i.e., we discard the higher frequency ones. Suppose we keep the first six zig-zag lines.
(a) How many coefficients are we keeping?
(b) How will we do, compared to keeping all the zig-zags (still using run-length encoding). Comment on both compression capability and image quality.
Question
Suppose that in MPEG our program detects errors in transmission (over wireless, say), and we know that, for some macroblock, we have correctly received the motion vector, but the DCT coefficient information is damaged. What should we do to promote error concealment?
Question
(a) What is the advantage of interlaced video? What are some of its problems?
(b) NTSC video has 525 lines per frame and $63.5\ \mu$sec per line, with 20 lines per field of vertical retrace and $10\ \mu$sec horizontal retrace.
1) Where does the $63.5\ \mu$sec come from?
2) Which takes more time, horizontal retrace or vertical retrace? How much more time?
Question
In many Computer Graphics applications, $\gamma$-correction is performed only in a color LUT (look-up table).
Give pseudocode for how to make such a lookup table, for an 8-bit CLUT, if it is meant for use in $\gamma$-correction.
Show the first 5 entries of the color LUT.
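One way such a table might be built is sketched below. The question does not state a value for $\gamma$, so the default of 2.2 is an assumption, as is the function name `make_gamma_lut`; the mapping used is the usual pre-correction $v \mapsto 255\,(v/255)^{1/\gamma}$.

```python
def make_gamma_lut(gamma=2.2, size=256):
    """Build an 8-bit gamma-correction look-up table.

    gamma=2.2 is an assumed display gamma, not a value from the question.
    Each input level i is normalised, raised to 1/gamma, and rescaled.
    """
    max_value = size - 1
    return [round(max_value * (i / max_value) ** (1.0 / gamma))
            for i in range(size)]


lut = make_gamma_lut()
first_five = lut[:5]
print(first_five)
```

The steep rise over the first few entries reflects how strongly the $1/\gamma$ power curve lifts dark values.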
Question
Assume an analog halftoning process uses a screen size of 200 "dots" (disks) per inch, with any size available. Suppose we wish to approximate this digitally, with about 100 intensity levels, not by using an ordered dither but by using an $n \times n$ pattern for each pixel.
How many bilevel dots per inch must our printer be capable of producing?
Question
We have spent time looking at the question of how to minimize the entropy.
Suppose now we find a mechanism for maximizing the entropy instead.
In terms of a grayscale image, this mechanism would re-map the pixel values to new ones. Roughly, what would be the result of such a re-mapping; i.e., what would the resulting image look like?
Question
Suppose a square image has $N^{2}$ pixels. We would like to approximately know how many pixels there are in total in a $(P+1)$-level image pyramid consisting of the original image plus $P$ smaller images, each of which is $1/4$ the size. As a first approximation, let's just count all possible levels, down to a size $1 \times 1$ image.
(a) First, suppose $N$ is a power of 2.
What is an expression for the exact count of pixels in this case, if $N = 2^{M}$?
What is the exact count of pixels in this case, if $N = 16$? Write the total as a binary and as a decimal number.
(b) Suppose $N$ is not a power of 2.
Just give an upper bound for the number of pixels, assuming there are an infinite number of pyramid levels and we can use floats for numbers of pixels. What is this upper bound if $N = 16$?
Hint: for $x < 1$ the Taylor series expansion of $1/(1-x)$ is $1 + x + x^{2} + x^{3} + \ldots$. What does this mean if $x = 1/4$?
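The counting in part (a) and the bound in part (b) can be cross-checked with a few lines of code. This is a sketch assuming $N$ is a power of 2 and that each level halves the side length; `pyramid_pixel_count` is a hypothetical helper name.

```python
def pyramid_pixel_count(n):
    """Total pixels in a pyramid from n x n down to 1 x 1 (n a power of 2)."""
    total = 0
    side = n
    while side >= 1:
        total += side * side   # pixels at this level
        side //= 2             # next level has a quarter as many pixels
    return total


n = 16
m = n.bit_length() - 1                       # N = 2**M
exact = pyramid_pixel_count(n)
closed_form = (4 ** (m + 1) - 1) // 3        # geometric series 1 + 4 + ... + 4**M
upper_bound = n * n / (1 - 1 / 4)            # infinite series with x = 1/4

print(exact, bin(exact), closed_form, upper_bound)
```

The alternating bit pattern in the binary form comes from each level contributing a distinct power of 4.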
Question
(a) State the Shannon-Fano Algorithm.
(b) Complete the following table using the Shannon-Fano Algorithm
(c) What is the entropy of this source, and in what units? Compare to the above result.
Question
Given the Lempel-Ziv-Welch (LZW) Decompression Algorithm,
read a character $k$;
output $k$;
show how the algorithm will decode the following (using a table like the one below).
THERE_A<259>_<256><259><260><256>INGS! !
Note: Assume that new multi-character entries start at index 256.
Question
In JPEG, each nonzero AC coefficient is described by a composite 8-bit value I = 'nnnnssss', where 'nnnn' codes the runlength and 'ssss' codes the category. Every DCT coefficient has a category $k$, where values are in the range $[2^{k-1}, 2^{k}-1]$ or $[-2^{k}+1, -2^{k-1}]$, with $1 \leq k \leq 10$ for the baseline system. For category $k$, it is necessary to send $k$ bits to specify the sign and the magnitude of the actual DCT coefficient itself.
The 4 bits 'nnnn' give the position of the current coefficient relative to the previous nonzero coefficient, i.e., the runlength of zero coefficients from the previous nonzero coefficient. The runlengths specified by 'nnnn' may range from 0 to 15, and a separate symbol, I = '11110000' = 240, represents a runlength of 16 zero coefficients. If the runlength exceeds 16 zero coefficients, it is coded by using multiple symbols. A special symbol, I = 0, is used to code the end of block (EOB), signaling all remaining coefficients in the block are zero.
The composite symbols for each block are then Huffman coded, followed by additional bits for the sign and magnitude of the actual DCT coefficient itself.
Question: How many elements are there in the total symbol set for Huffman coding, encompassing categories, runlengths, and additional symbols? (we're not concerned with the sign and magnitude bits, here.)
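The symbol set described above can be enumerated mechanically. The sketch below simply follows the description on this card (runlengths 0 to 15, categories 1 to 10, plus the two special symbols); the variable names are illustrative.

```python
# Composite symbols 'nnnnssss': runlength in the high nibble, category in the low nibble.
symbols = {(run << 4) | category
           for run in range(16)            # runlengths 0..15
           for category in range(1, 11)}   # categories 1..10 (baseline)

symbols.add(0xF0)   # '11110000' = 240: run of 16 zeros
symbols.add(0x00)   # end of block (EOB)

print(len(symbols))
```

Neither special symbol collides with a run/category pair, because both have category bits 'ssss' equal to zero.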
Question
(a) In MPEG, what are all the different kinds of frames?
(b) What are they used for?
(c) Is the method that is used for motion compensation in MPEG based on x, y translation the most complicated method for motion compensation in use now in any standard? If so, explain why; if not, explain what other method is used.
(d) Does MPEG video compression require a higher bitrate for video clips that have more action in them? Explain.
Question
Compare and contrast the
(a) bandwidth (i.e., bitrate), and
(b) playback requirements of uncompressed digital audio and video.
Question
(a) What is Signal to Quantization Noise Ratio (SQNR)?
(b) How does an additional 2 bits affect the SQNR?
(c) Explain why the worst SQNR occurs when the sample equals half of the interval.
Question
Dissolve: Suppose we have video1 dissolving into video2, over a time $t$ from 0 to $t_{\max}$ (video1 gradually disappears, and video2 gradually appears).
There are 2 ways that this task is commonly carried out: "Ordinary Dissolve" and "Dither Dissolve". In Ordinary Dissolve, every pixel value is changed, over time, so that it contains partly the contents of video1 and partly the contents of video2, summed additively. In Dither Dissolve, pixels are either all-video1 or all-video2, not a mix; the decision of which video to take pixel values from is based on a random number generator.
Write pseudocode solutions for accomplishing these two kinds of gradual video transition. For each type, just show the algorithm for filling up R (Red) values - Green and Blue will be similar.
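The two transitions described above might be sketched as follows for the Red plane, represented here as a list of rows. This is one possible reading, not a definitive solution: the linear weight `t / t_max` and the function names are assumptions.

```python
import random


def ordinary_dissolve(r1, r2, t, t_max):
    """Blend every pixel: weight (1 - t/t_max) on video1, t/t_max on video2."""
    w = t / t_max
    return [[(1 - w) * a + w * b for a, b in zip(row1, row2)]
            for row1, row2 in zip(r1, r2)]


def dither_dissolve(r1, r2, t, t_max, rng=random):
    """Pick each pixel wholly from video2 with probability t/t_max, else video1."""
    w = t / t_max
    return [[b if rng.random() < w else a for a, b in zip(row1, row2)]
            for row1, row2 in zip(r1, r2)]


video1_red = [[10, 20], [30, 40]]
video2_red = [[200, 210], [220, 230]]

halfway = ordinary_dissolve(video1_red, video2_red, t=1, t_max=2)
print(halfway)
print(dither_dissolve(video1_red, video2_red, t=1, t_max=2))
```

Because `random.random()` returns values in [0, 1), the dither version yields pure video1 at `t = 0` and pure video2 at `t = t_max`.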
Question
In MIDI, for Channel Messages, how many different "opcodes" can there be in the Status Byte? Why?
Question
In many Computer Graphics applications, $\gamma$-correction is performed only in a color LUT (look-up table). Show the first 5 entries of the color LUT if it is meant for use in $\gamma$-correction.
Question
To make matters simpler for eventual printing, we buy a camera equipped with CMY sensors, as opposed to RGB sensors (CMY cameras are in fact available).
(a) Draw spectral curves roughly depicting what such a camera's sensitivity to wavelength might look like.
(b) Could the output of a CMY camera be used to produce ordinary RGB pictures? How?
Question
(a) What is the meaning of the "horseshoe" shape in the chromaticity diagram?
(b) Where does that curve come from? - i.e., how is it calculated?
Question
Suppose you wish to transmit a stereo audio signal through a 1 mega-bit/s connection in real time. Consider the following scenarios:
i) You are using a sampling frequency of 44.1 kHz. What is the maximum average number of bits that you can use to represent an audio sample?
ii) You want to use a 16 bit/sample/channel representation. What is the maximum sampling frequency? What will you need to do in order to avoid aliasing?
iii) You want to use a sampling frequency of 44.1 kHz, and also want to use a 16 bit/sample/channel representation. What is the minimum compression ratio you need in order to transmit the audio signal?
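The arithmetic behind the three scenarios can be laid out as below. This is a sketch under the assumption that "1 mega-bit/s" means $10^{6}$ bits per second and that stereo means two channels; the constant names are illustrative.

```python
LINK_BPS = 1_000_000      # assumed: 1 mega-bit/s = 10**6 bits per second
CHANNELS = 2              # stereo

# i) 44.1 kHz sampling: bits available per sample per channel
max_bits_per_sample = LINK_BPS / (44_100 * CHANNELS)

# ii) 16 bits/sample/channel: highest sampling rate the link can carry
max_sampling_hz = LINK_BPS / (16 * CHANNELS)
nyquist_hz = max_sampling_hz / 2      # content above this must be filtered out

# iii) 44.1 kHz at 16 bits/sample/channel: raw rate vs. link capacity
raw_bps = 44_100 * 16 * CHANNELS
min_compression_ratio = raw_bps / LINK_BPS

print(int(max_bits_per_sample), max_sampling_hz, nyquist_hz, min_compression_ratio)
```

The Nyquist figure indicates where a low-pass (anti-aliasing) filter would need to cut off before sampling.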
Question
Explain the following terms:
(a) Image Resolution
(b) Bitmap
Question
Generally, for gray input images what are half-toning and dithering? How are they related to each other? What is ordered dithering?
Question
(a) In the simplest version of the median-cut algorithm, does it make any difference whether we assign bits in the order RGBRGBRG, or GBRGBRGB, etc.? Explain.
(b) Suppose we decide to quantize an 8-bit grayscale image down to just 2 bits of accuracy. What is the simplest way to do so? What ranges of byte values in the original image are mapped to what quantized values?
Question
How would you create your own video wipe transition from the top-left corner of the viewport down to the bottom-right corner - a diagonal transition? Fig. 1 shows such a video transition.
Figure 1: Wipe transition, at $\mathrm{t}/\mathrm{tmax} = 0.66$
Here, we wish to simply take pixels from either the first or the second video, depending on whether they are above or below the moving diagonal line.
Write some pseudo-C or pseudo-Premiere pseudocode to produce correct pixel values during the transition.
Hint: for any $x$ and $y$ position, we can determine where the line that $\{x, y\}$ is on, which is parallel to the wipe, cuts the main diagonal from top-left to bottom-right of the frame, simply using similar triangles. To do so, it's easiest to calculate the y-intercept of that line (where it hits the y-axis).
Figure 2: Wipe transition geometry.
Question
etc.
Notes: 'x' in status byte hex value stands for a channel number.
(a) 1) Is the Pitch Bend MIDI message a Channel Message?
2) The Pitch Bend opcode in MIDI is followed by two data bytes specifying how the control is to be altered. How many bits of accuracy does this amount of data correspond to? Why?
3) The MIDI communications standard specifies 31250 bps (bits per sec); how many Pitch Bend messages could be sent in 3 seconds if the message stream consisted only of Pitch Bend messages?
(b) The note "A above Middle C" (with frequency 440 Hz) is note 69 in General MIDI. What MIDI bytes (in hex) should be sent to play a note twice the frequency of (i.e., one octave above) "A above Middle C" at maximum volume on channel 1? (Don't include start/stop bits.) Information: An octave is 12 steps on a piano, i.e., 12 notes up.
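The byte-level reasoning in this question can be sketched in code. The status-byte table is not reproduced on this card, so the sketch assumes the standard MIDI 1.0 conventions: Note On status `0x9n` with channel 1 mapped to `n = 0`, 7 usable bits per data byte, and 10 bits on the wire per byte (8 data bits plus start and stop bits). The helper `note_on` is a hypothetical name.

```python
def note_on(channel, note, velocity):
    """Build a Note On message; channel is 1-based, as in the question."""
    status = 0x90 | (channel - 1)      # assumed standard status byte 0x9n
    return bytes([status, note, velocity])


# (b) One octave above note 69 is 12 semitones up; maximum velocity is 127.
message = note_on(channel=1, note=69 + 12, velocity=127)

# (a) 2) Two data bytes, each carrying 7 usable bits.
pitch_bend_resolution_bits = 2 * 7

# (a) 3) A Pitch Bend message is 3 bytes; each byte costs 10 bits on the wire.
BITS_PER_BYTE_ON_WIRE = 10
pitch_bend_messages = (31250 * 3) // (3 * BITS_PER_BYTE_ON_WIRE)

print(message.hex(" "), pitch_bend_resolution_bits, pitch_bend_messages)
```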
Question
When we create a sprite, we usually completely replace pixel values by those in the sprite. Suppose, however, we wish to replace pixel values by a combination with weights $\alpha$ from the original background, plus $(\alpha-1)$ from the sprite, where $\alpha$ is in $[0,1]$.
State how you could do this, using only LOGICAL operators and an ADD.
For this question, assume the following: (1) You already have a sprite $S_{\alpha}$ which has the character's colors all multiplied by the factor $\alpha$, and black outside the character;
(2) You already have a second image of the (entire) background, already multiplied by $(1-\alpha)$, called $B_{1-\alpha}$.
Question
It is known that a loss of audio output at both ends of the audible frequency range is inevitable due to the frequency response function of audio amplifier and medium (e.g., tape).
(a) If the output was 1 volt for frequencies at mid-range, after a loss of $-3$ dB at 18 kHz what is the output voltage at this frequency? [Hint: Assume $\log_{10} 2 = 0.3$.]
(b) To compensate for the loss, a listener can adjust the gain (and hence the output) at different frequencies from an equalizer. If the loss remains $-3$ dB and the gain through the equalizer is 6 dB at 18 kHz, what is the output voltage now?
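A quick numerical check of both parts is sketched below, assuming the usual amplitude convention for voltages, $\mathrm{dB} = 20 \log_{10}(V / V_{\mathrm{ref}})$; the function name `voltage_after_gain` is hypothetical.

```python
def voltage_after_gain(v_ref, gain_db):
    """Voltage after applying gain_db, using dB = 20 * log10(V / v_ref)."""
    return v_ref * 10 ** (gain_db / 20)


v_loss = voltage_after_gain(1.0, -3)          # (a) 3 dB loss at 18 kHz
v_equalized = voltage_after_gain(1.0, -3 + 6) # (b) loss plus 6 dB equalizer gain

print(round(v_loss, 3), round(v_equalized, 3))
```

With the hint's approximation $\log_{10} 2 = 0.3$, a change of 3 dB corresponds to a factor of $\sqrt{2}$ in voltage, which is what the computed values approximate.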
Question
(a) The note "A above Middle C" (with frequency 440 Hz) is note 69 in General MIDI. What MIDI bytes (in hex) should be sent to play a note twice the frequency of (i.e., one octave above) "A above Middle C" at maximum volume on "Channel 1"? (Don't include start/stop bits.)
Information: An octave is 12 steps on a piano, i.e., 12 notes up.
(b) What bytes should be sent immediately after that?
Question
Briefly, for grey input images explain what half-toning and dithering are. How are they related to each other? What is ordered dithering?
Question
Suppose we acquire a video which has been compressed using Motion-JPEG, and import it into Adobe Premiere (or a similar program). Then we create a movie using an MPEG-4 codec. Comment on the
(a) compression ratio
(b) appearance of the resulting video.
Question
When we view video on a computer, the analog video is digitized and stored in the frame buffer of the video "frame grabber" card.
Suppose that a video is digitized at (integer) NTSC frame rate, has size $640 \times 480$ pixels, and is stored with a bit depth of 24 bits. We're interested in displaying the captured video.
(a) What must be the minimal bandwidth of the system bus in Mbps when data is moved from the video frame grabber to the memory for video display?
(b) How much storage capacity (in GBytes) is required to store 1 minute of this video?
(c) Explain why you don't see a flicker effect on your workstation screen when displaying this video at NTSC frame rate?
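The figures for parts (a) and (b) follow from straightforward multiplication, sketched below. The sketch assumes the integer NTSC frame rate is 30 frames per second and that Mbps and GBytes use decimal prefixes ($10^{6}$ and $10^{9}$).

```python
WIDTH, HEIGHT = 640, 480
BIT_DEPTH = 24            # bits per pixel
FPS = 30                  # assumed integer NTSC frame rate

bits_per_second = WIDTH * HEIGHT * BIT_DEPTH * FPS
mbps = bits_per_second / 1e6                  # (a) minimal bus bandwidth

bytes_per_minute = bits_per_second * 60 // 8
gbytes_per_minute = bytes_per_minute / 1e9    # (b) storage for one minute

print(mbps, gbytes_per_minute)
```

Using binary prefixes instead would give slightly smaller numbers, so an answer should state which convention it uses.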
Question
The "hue" is the colour, independent of brightness and how much pure white has been added to it. We can make a simple definition of hue as the set of ratios R:G:B.
Suppose a colour (i.e., an RGB) is divided by 2.0, so that the RGB triple now has values 0.5 times its former values.
Explain using numerical values:
(a) If gamma-correction is not applied, does the second RGB have the same hue as the first RGB, when displayed? (we're not discussing any psychophysical effects that change our perception - here we're just worried about the machine itself).
(b) State all colour triples for which the hue is unchanged.
1
List three QoS parameters for multimedia transmission and explain for some specific applications how the values of these parameters are affected by the application data.
Not Answer
2
How does ATM support multimedia transmission, in particular QoS requests?
Not Answer
3
Suppose you have a dedicated channel with fixed bandwidth, and you would like to provide channel surfing capabilities which restrict you from using more than one second delay in video decoding.
Not Answer
4
Among FDMA, TDMA or CDMA, which one provides the most efficient use of the allocated spectrum for multiple access? Justify your choice.
Unlock Deck
Unlock for access to all 52 flashcards in this deck.
Unlock Deck
k this deck
5
What is texture? Explain in detail what is meant by texture descriptors, for use in content-based image and video search.
Unlock Deck
Unlock for access to all 52 flashcards in this deck.
Unlock Deck
k this deck
6
Describe the most important features for video search based on content descriptors. If you had to find a single frame to describe an entire video, what features would you use to drive the search?
Unlock Deck
Unlock for access to all 52 flashcards in this deck.
Unlock Deck
k this deck
7
What constitutes "interactivity" in a multimedia project?
Please discuss briefly the levels of interactivity possible, from least interactive to most interactive.
Unlock Deck
Unlock for access to all 52 flashcards in this deck.
Unlock Deck
k this deck
8
Write down an algorithm (pseudocode) for defining and calculating a colour histogram for RGB data.
Unlock Deck
Unlock for access to all 52 flashcards in this deck.
Unlock Deck
k this deck
9
The "hue" is the colour, independent of brightness and how much pure white has been added to it. We can make a simple definition of hue as the set of ratios R:G:B. Suppose a colour (i.e., an RGB) is divided by 2.0, so that the RGB triple now has values 0.5 times its former values. Explain using numerical values:
(a) If gamma-correction is applied after the division by 2.0 and before the colour is stored, does the darker RGB have the same hue as the original in the sense of having the same ratios R:G:B of light emanating from the CRT display device? (we're not discussing any psychophysical effects that change our perception - here we're just worried about the machine itself).
(b) If gamma-correction is not applied, does the second RGB have the same hue as the first RGB, when displayed?
(c) For what colour triples is the hue always unchanged?
10
Suppose we view a decompressed $512 \times 512$ JPEG image, but use only the colour part of the stored image information, not the luminance part, to decompress. What does the $512 \times 512$ colour image look like? Assume JPEG is compressed using a 4:2:0 scheme.
11
Suppose an alphabet consists of 6 symbols, and the probability for each of the symbols is $1/6$. (Note: $\log_2(3) = 1.585$.)
(a) What is the entropy for this set?
(b) Draw the Shannon-Fano tree for this set. What is the average bitrate?
(c) Draw the Huffman tree for this set. What is the average bitrate?
(d) How many bits would we need without compression, assuming fixed-length codewords? What is the compression ratio, compared to the Huffman tree?
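The quantities asked for in (a), (c) and (d) can be checked numerically. The sketch below is one way to do it, assuming a standard binary Huffman construction; the helper names are illustrative, and ties between equal probabilities may be broken differently in a hand-drawn tree without changing the average length.

```python
import heapq
import math
from fractions import Fraction


def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(float(p) * math.log2(float(p)) for p in probs if p > 0)


def huffman_code_lengths(probs):
    """Return the Huffman codeword length of each symbol."""
    # Heap items: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1             # one level deeper in the tree
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


probs = [Fraction(1, 6)] * 6
lengths = huffman_code_lengths(probs)
avg_len = sum(p * n for p, n in zip(probs, lengths))
fixed_len = math.ceil(math.log2(len(probs)))   # bits for fixed-length codewords
ratio = Fraction(fixed_len) / avg_len

print(entropy_bits(probs), sorted(lengths), avg_len, ratio)
```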
12
Consider an alphabet with two symbols $A, B$, with probability $P(A) = x$ and $P(B) = 1 - x$. Plot the entropy as a function of $x$.
Note: you might want to use $\log_2(3) = 1.6$, $\log_2(7) = 2.8$.
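As a rough check on the shape of the plot, the binary entropy $H(x) = -x \log_2 x - (1-x) \log_2 (1-x)$ can be tabulated at a few points. The sample points below are chosen to match the logarithms given in the note; the function name is illustrative.

```python
import math


def binary_entropy(x):
    """Entropy in bits of a two-symbol source with P(A) = x."""
    if x in (0.0, 1.0):
        return 0.0                        # limit of x*log2(x) as x -> 0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


for x in [0, 1 / 8, 1 / 4, 1 / 2, 3 / 4, 7 / 8, 1]:
    print(f"x = {x:.3f}  H = {binary_entropy(x):.3f} bits")
```

The curve is symmetric about $x = 1/2$, where it reaches its maximum of 1 bit, and falls to 0 at both ends.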
13
Thinking about my large collection of .jpg images, I decide to unify them and make them more accessible by combining them into a big .mpg file, treating them as frames in a video: my reasoning is that I can simply use a viewer to step through the file, thus making a cohesive whole out of my collection. Comment on the utility of this idea, in terms of the compression ratio achievable for the set of images.
14
(a) Please define "motion estimation".
(b) Please define "motion compensation".
15
Suppose an $8 \times 8$ image block happens to have the following entries:
$$\begin{array}{rrrrrrrr}
183 & 160 & 94 & 0 & 0 & 0 & 0 & 0 \\
183 & 153 & 0 & 0 & 0 & 0 & 0 & 0 \\
179 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}$$
(Note that this is a greylevel, 8-bit image, not DCT output).
Now suppose we decide to encode this image into the frequency domain as follows:
• First we go down each column, and carry out a 1-dimensional DCT, replacing each column by its set of DCT coefficients.
• However, for the first column we use only a length-3 DCT (i.e., $N = 3$); for the second column we use a length-2 DCT, and for the third column we use a length-1 DCT, always leaving zeros in the transform domain just where they appeared in the original, image domain.
• We leave the DC coefficient always at the top of each column processed.
• Then we use the output from the above stage and go on to do the same procedure for rows 1 to 3.
Question:
(a) Which takes more calculations, the above procedure, or the ordinary 2-D DCT transform? Explain.
(b) Broadly, what is the difference, if any, in the output DCT Image between the new transform and the standard one, for this particular image?
Note: One need not do any calculations for this question but, for reference, recall that the 2-D DCT for an $M \times N$ block size is defined as
$$F(u, v) = \frac{2 C(u) C(v)}{\sqrt{M N}} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \cos \frac{(2i+1) \cdot u \pi}{2M} \cos \frac{(2j+1) \cdot v \pi}{2N} f(i, j),$$
where $i, u \in [0, M-1]$, $j, v \in [0, N-1]$, and the constants $C(u)$ and $C(v)$ are determined by
$$C(\xi) = \begin{cases} \frac{\sqrt{2}}{2} & \text{if } \xi = 0 \\ 1 & \text{otherwise.} \end{cases}$$
The 1-D DCT is given by
$$F(u) = \sqrt{\frac{2}{N}}\, C(u) \sum_{i=0}^{N-1} \cos \frac{(2i+1) u \pi}{2N} f(i)$$
16
An original $8 \times 8$ color "checkerboard" CMY image is shown below in which the two colors are C1: (C = 255, M = 155, Y = 255) and C2: (C = M = Y = 100), where $[0..255]$ is the range for the three color components. You are asked to convert the color CMY image to YIQ images using 4:1:1 chroma subsampling. (In subsampling, you should use an averaging method so you are not selectively throwing away information from certain pixels.)
[Figure: $8 \times 8$ color checkerboard CMY image]
(a) Show all pixel values of each of the YIQ images generated from the given CMY color image.
(b) Besides their low resolution, do the chrominance images maintain enough information in this case? What does this tell?
Note: The relationship between RGB and YIQ is approximately:
$$\left[\begin{array}{l} Y \\ I \\ Q \end{array}\right] = \left[\begin{array}{rrr} 0.3 & 0.6 & 0.1 \\ 0.6 & -0.3 & -0.3 \\ 0.2 & -0.5 & 0.3 \end{array}\right] \left[\begin{array}{l} R \\ G \\ B \end{array}\right]$$
17
Suppose we have a small 8-bit grayscale image, with all pixels equal to the same pixel value, say 113. Consider the performance of an LZW compression scheme. First initialize codes in the dictionary with pixel values, $0..255$. Use 9-bit codes.
For a $4 \times 4$ uniform image made of pixel values which are all 113, how many bits will LZW (i.e., PKZIP, WINZIP, etc.) use for a compressed version of the image? Explain in detail, using an LZW table. What is the compression ratio?
Hint: recall the LZW coding algorithm.
[Figure: LZW coding algorithm listing]
Answer:
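A minimal sketch of how the count could be checked, assuming the textbook form of the LZW encoder (emit the code for the longest known string, then add that string plus the next symbol to the dictionary). The function name and the tuple-keyed dictionary are illustrative choices.

```python
def lzw_encode(symbols, alphabet_size=256):
    """Textbook LZW encoder over a sequence of integer symbols."""
    dictionary = {(i,): i for i in range(alphabet_size)}
    next_code = alphabet_size
    s = (symbols[0],)
    codes = []
    for c in symbols[1:]:
        candidate = s + (c,)
        if candidate in dictionary:
            s = candidate                 # keep extending the current string
        else:
            codes.append(dictionary[s])   # emit code for the known prefix
            dictionary[candidate] = next_code
            next_code += 1
            s = (c,)
    codes.append(dictionary[s])           # flush the final string
    return codes


image = [113] * 16                        # 4 x 4 uniform image, row by row
codes = lzw_encode(image)
compressed_bits = len(codes) * 9          # 9-bit codes, as stated
original_bits = len(image) * 8
print(codes, compressed_bits, original_bits / compressed_bits)
```

Under these assumptions the encoder emits six codes, so the compressed size is 54 bits against 128 bits uncompressed.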
18
Consider a block (8x8 pixels) of an image as shown below. In a particular color plane, the pixel values are as follows:
[Figure: pixel values of the 8x8 block]
A standard 2-D DCT for an $8 \times 8$ block size is defined as
$$F(u, v) = \frac{C(u) C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos \frac{(2i+1) \cdot u \pi}{16} \cos \frac{(2j+1) \cdot v \pi}{16} f(i, j),$$
where $i, j, u, v$ are in $0..7$, and the constants $C(u)$ and $C(v)$ are determined by
$$C(\xi) = \begin{cases} \frac{\sqrt{2}}{2} & \text{if } \xi = 0 \\ 1 & \text{otherwise.} \end{cases}$$
Suppose we compute a DCT $F(u, v)$, where $u$ is rows and $v$ is columns.
(a) What value does $F(0, 0)$ have? Explain.
(b) Describe the contents (roughly) of the other components. Explain.
Hint: Just thinking about it, rather than calculating everything, will save you time. What are values $F(u, 0)$? What are values $F(0, v)$?
What are other values $F(u, v)$?
19
(a) Why do we use CMY color primaries for printing, instead of RGB ones? Hint: paper is white, not black.
(b) What colour is Yellow and Cyan, printed together? Why?
20
(a) Draw a curve showing the relationship of the CIELAB brightness axis to the luminance Y.
(b) What curve studied elsewhere in this course does this resemble? Why is that the case?
21
In Adaptive Huffman coding using a special NYT code and a 5-bit set of initial codes for an input source consisting of 26 characters, which takes more CPU time, encoding or decoding? Briefly explain.
22
Another name for zig-zag coding is "zonal coding". Suppose you invent a new zonal coding scheme for JPEG that simply discards anti-diagonals above the first few - i.e., we discard the higher frequency ones. Suppose we keep the first six zig-zag lines.
(a) How many coefficients are we keeping?
(b) How will we do, compared to keeping all the zig-zags (still using run-length encoding)? Comment on both compression capability and image quality.
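For part (a), a small enumeration can confirm the count. This sketch assumes that a "zig-zag line" means one anti-diagonal of the $8 \times 8$ coefficient block, i.e. all positions $(u, v)$ with the same $u + v$; the function name is illustrative.

```python
def kept_coefficients(num_diagonals, block=8):
    """Positions (u, v) lying on the first num_diagonals anti-diagonals."""
    return [(u, v)
            for u in range(block)
            for v in range(block)
            if u + v < num_diagonals]


kept = kept_coefficients(6)
print(len(kept), "of", 8 * 8, "coefficients kept")
```

The first six anti-diagonals hold 1 + 2 + 3 + 4 + 5 + 6 coefficients.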
23
Suppose that in MPEG our program detects errors in transmission (over wireless, say), and we know that, for some macroblock, we have correctly received the motion vector, but the DCT coefficient information is damaged. What should we do to promote error concealment?
Answer:
24
(a) What is the advantage of interlaced video? What are some of its problems?
(b) NTSC video has 525 lines per frame and $63.5\,\mu\mathrm{sec}$ per line, with 20 lines per field of vertical retrace and $10\,\mu\mathrm{sec}$ horizontal retrace.
1) Where does the $63.5\,\mu\mathrm{sec}$ come from?
2) Which takes more time, horizontal retrace or vertical retrace? How much more time?
25
In many Computer Graphics applications, $\gamma$-correction is performed only in a color LUT (look-up table).
Give pseudocode for how to make such a lookup table, for an 8-bit CLUT, if it is meant for use in $\gamma$-correction.
Show the first 5 entries of the color LUT.
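One way such a table could be built is sketched below. It assumes a display gamma of 2.2 (the question does not state a value) and the usual correction $v' = v^{1/\gamma}$ applied to values normalized to $[0, 1]$; the function name is illustrative.

```python
def gamma_lut(gamma=2.2, levels=256):
    """Build a gamma-correction lookup table for 8-bit values.

    gamma=2.2 is an assumed display gamma, not given in the question.
    """
    top = levels - 1
    return [round(top * (i / top) ** (1.0 / gamma)) for i in range(levels)]


lut = gamma_lut()
print(lut[:5])
```

With these assumptions the first five entries are 0, 21, 28, 34 and 39: the low end of the range is stretched strongly, which is the point of the correction.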
26
Assume an analog halftoning process uses a screen size of 200 "dots" (disks) per inch, with any size available. Suppose we wish to approximate this digitally, with about 100 intensity levels, not by using an ordered dither but by using an $n \times n$ pattern for each pixel.
How many bilevel dots per inch must our printer be capable of producing?
27
We have spent time looking at the question of how to minimize the entropy.
Suppose now we find a mechanism for maximizing the entropy instead.
In terms of a grayscale image, this mechanism would re-map the pixel values to new ones. Roughly, what would be the result of such a re-mapping; i.e., what would the resulting image look like?
Answer:
28
Suppose a square image has $N^2$ pixels. We would like to approximately know how many pixels there are in total in a $(P+1)$-level image pyramid consisting of the original image plus $P$ smaller images, each of which is $1/4$ the size. As a first approximation, let's just count all possible levels, down to a size 1x1 image.
(a) First, suppose $N$ is a power of 2.
What is an expression for the exact count of pixels in this case, if $N = 2^M$?
What is the exact count of pixels in this case, if $N = 16$? Write the total as a binary and as a decimal number.
(b) Suppose $N$ is not a power of 2.
Just give an upper bound for the number of pixels, assuming there are an infinite number of pyramid levels and we can use floats for numbers of pixels. What is this upper bound if $N = 16$?
Hint: for $x < 1$ the Taylor series expansion of $1/(1-x)$ is $1 + x + x^2 + x^3 + \ldots$. What does this mean if $x = 1/4$?
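The counts for $N = 16$ can be checked with a short loop. This is a sketch under the question's own assumptions (each level halves the side length, down to 1x1); the function name is illustrative, and the closed form $(4^{M+1} - 1)/3$ is simply the finite geometric sum $\sum_{k=0}^{M} 4^k$.

```python
def pyramid_pixels(n):
    """Total pixels in a pyramid whose levels halve the side down to 1x1."""
    total = 0
    side = n
    while side >= 1:
        total += side * side
        side //= 2                        # next level has half the side length
    return total


n, m = 16, 4                              # N = 2**M
exact = pyramid_pixels(n)
closed_form = (4 ** (m + 1) - 1) // 3     # geometric series, ratio 4
upper_bound = n * n / (1 - 1 / 4)         # infinite series with x = 1/4

print(exact, bin(exact), closed_form, upper_bound)
```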
29
(a) State the Shannon-Fano Algorithm.
(b) Complete the following table using the Shannon-Fano Algorithm.
[Table: symbol table to be completed]
(c) What is the entropy of this source, and in what units? Compare to the above result.
30
Given the Lempel-Ziv-Welch (LZW) Decompression Algorithm,
read a character $k$;
output $k$;
[Figure: remainder of the LZW decompression algorithm listing]
show how the algorithm will decode the following (using the table like that below).
THERE_A<259>_<256><259><260><256>INGS! !
Note: Assume that new multi-character entries start at index 256.
[Figure: decoding table to be filled in]
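The decoding can be traced with a small script. This is a sketch of a standard LZW decoder, not necessarily line-for-line the listing the question refers to. It assumes single characters arrive as themselves, bracketed numbers are dictionary codes, and the trailing "! !" is two exclamation marks; the function name is illustrative.

```python
def lzw_decode(tokens, first_new_code=256):
    """Decode a token stream: str = literal character, int = dictionary code."""
    table = {}
    next_code = first_new_code
    output = []
    prev = None
    for tok in tokens:
        if isinstance(tok, str):
            entry = tok
        elif tok in table:
            entry = table[tok]
        elif tok == next_code and prev is not None:
            entry = prev + prev[0]        # code used before it is defined
        else:
            raise ValueError(f"unknown code {tok}")
        output.append(entry)
        if prev is not None:
            # New entry: previous string plus first character of this one.
            table[next_code] = prev + entry[0]
            next_code += 1
        prev = entry
    return "".join(output), table


tokens = ["T", "H", "E", "R", "E", "_", "A", 259, "_", 256, 259, 260, 256,
          "I", "N", "G", "S", "!", "!"]
text, table = lzw_decode(tokens)
print(text)
for code in range(256, 268):
    print(code, table[code])
```

Under those assumptions the stream decodes to THERE_ARE_THREE_THINGS!!, with entries 256 to 260 being TH, HE, ER, RE and E_.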
31
In JPEG, each nonzero AC coefficient is described by a composite 8-bit value I = 'nnnnssss', where 'nnnn' codes the runlength and 'ssss' codes the category. Every DCT coefficient has a category $k$, where values are in the range $[2^{k-1}, 2^k - 1]$ or $[-2^k + 1, -2^{k-1}]$, with $1 \leq k \leq 10$ for the baseline system. For category $k$, it is necessary to send $k$ bits to specify the sign and the magnitude of the actual DCT coefficient itself.
The 4 bits 'nnnn' give the position of the current coefficient relative to the previous nonzero coefficient, i.e., the runlength of zero coefficients from the previous nonzero coefficient. The runlengths specified by 'nnnn' may range from 0 to 15, and a separate symbol, I = '11110000' = 240, represents a runlength of 16 zero coefficients. If the runlength is more than 16 zero coefficients, it is coded by using multiple symbols. A special symbol, I = 0, is used to code the end of block (EOB), signaling all remaining coefficients in the block are zero.
The composite symbols for each block are then Huffman coded, followed by additional bits for the sign and magnitude of the actual DCT coefficient itself.
Question: How many elements are there in the total symbol set for Huffman coding, encompassing categories, runlengths, and additional symbols? (we're not concerned with the sign and magnitude bits, here.)
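The symbol set described above can be enumerated directly. The sketch below builds every 'nnnnssss' value with a runlength of 0 to 15 and a category of 1 to 10, then adds the two special symbols named in the text; the variable names are illustrative.

```python
RUNLENGTHS = range(16)        # 'nnnn': 0..15
CATEGORIES = range(1, 11)     # 'ssss': 1..10 for the baseline system

# Ordinary run/category symbols, packed as an 8-bit value.
symbols = {(run << 4) | cat for run in RUNLENGTHS for cat in CATEGORIES}

# Special symbols: run of 16 zeros (240) and end of block (0).
symbols |= {0b11110000, 0}

print(len(symbols))
```

That is 16 x 10 ordinary symbols plus the two special ones.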
32
(a) In MPEG, what are all the different kinds of frames?
(b) What are they used for?
(c) Is the method that is used for motion compensation in MPEG based on x, y translation the most complicated method for motion compensation in use now in any standard? If so, explain why; if not, explain what other method is used.
(d) Does MPEG video compression require a higher bitrate for video clips that have more action in them? Explain.
33
Compare and contrast the
(a) bandwidth (i.e., bitrate), and
(b) playback requirements of uncompressed digital audio and video.
34
(a) What is Signal to Quantization Noise Ratio (SQNR)?
(b) How does an additional 2 bits affect the SQNR?
(c) Explain why the worst SQNR occurs when the sample equals half of the interval.
35
Dissolve: Suppose we have video1 dissolving into video2, over a time $t$ from 0 to $t_{\max}$ (video1 gradually disappears, and video2 gradually appears).
There are 2 ways that this task is commonly carried out: "Ordinary Dissolve" and "Dither Dissolve". In Ordinary Dissolve, every pixel value is changed, over time, so that it contains partly the contents of video1 and partly the contents of video2, summed additively. In Dither Dissolve, pixels are either all-video1 or all-video2, not a mix; the decision of which video to take pixel values from is based on a random number generator.
Write pseudocode solutions for accomplishing these two kinds of gradual video transition. For each type, just show the algorithm for filling up R (Red) values - Green and Blue will be similar.
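The two transitions described above might be sketched as follows for the R channel. This assumes each frame's red values are held in a list of rows and that the mix is linear in $t / t_{\max}$; the function names and the frame representation are illustrative, not taken from any particular editing package.

```python
import random


def ordinary_dissolve(r1, r2, t, t_max):
    """Every pixel is a weighted sum of the two videos' R values."""
    a = t / t_max                          # fraction of video2 showing
    return [[(1 - a) * p1 + a * p2 for p1, p2 in zip(row1, row2)]
            for row1, row2 in zip(r1, r2)]


def dither_dissolve(r1, r2, t, t_max, rng=random):
    """Each pixel comes entirely from video1 or video2, chosen at random."""
    a = t / t_max                          # probability of choosing video2
    return [[p2 if rng.random() < a else p1 for p1, p2 in zip(row1, row2)]
            for row1, row2 in zip(r1, r2)]


frame1 = [[10, 20], [30, 40]]
frame2 = [[200, 210], [220, 230]]
print(ordinary_dissolve(frame1, frame2, t=1, t_max=4))
print(dither_dissolve(frame1, frame2, t=1, t_max=4))
```

In both cases $t = 0$ yields video1 unchanged and $t = t_{\max}$ yields video2.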
36
In MIDI, for Channel Messages, how many different "opcodes" can there be in the Status Byte? Why?
37
In many Computer Graphics applications, $\gamma$-correction is performed only in a color LUT (look-up table). Show the first 5 entries of the color LUT if it is meant for use in $\gamma$-correction.
38
To make matters simpler for eventual printing, we buy a camera equipped with CMY sensors, as opposed to RGB sensors (CMY cameras are in fact available).
(a) Draw spectral curves roughly depicting what such a camera's sensitivity to wavelength might look like.
(b) Could the output of a CMY camera be used to produce ordinary RGB pictures? How?
39
(a) What is the meaning of the "horseshoe" shape in the chromaticity diagram?
(b) Where does that curve come from? - i.e., how is it calculated?
40
Suppose you wish to transmit a stereo audio signal through a 1 mega-bit/s connection in real time. Consider the following scenarios:
i) You are using a sampling frequency of 44.1 kHz. What is the maximum average number of bits that you can use to represent an audio sample?
ii) You want to use a 16 bit/sample/channel representation. What is the maximum sampling frequency? What will you need to do in order to avoid aliasing?
iii) You want to use a sampling frequency of 44.1 kHz, and also want to use a 16 bit/sample/channel representation. What is the minimum compression ratio you need in order to transmit the audio signal?
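The arithmetic behind the three scenarios can be laid out as below. The sketch assumes "1 mega-bit/s" means $10^6$ bits per second and that stereo means two channels sampled at the same rate; the constant names are illustrative.

```python
LINK_BPS = 1_000_000          # assumed: 1 mega-bit/s = 10**6 bit/s
CHANNELS = 2                  # stereo

# i) bits available per sample at 44.1 kHz
max_bits_per_sample = LINK_BPS / (CHANNELS * 44_100)

# ii) highest sampling rate at 16 bit/sample/channel
max_sampling_hz = LINK_BPS / (CHANNELS * 16)
highest_safe_frequency_hz = max_sampling_hz / 2    # Nyquist limit

# iii) raw bitrate at 44.1 kHz, 16 bit, and the ratio needed to fit the link
raw_bps = CHANNELS * 16 * 44_100
min_compression_ratio = raw_bps / LINK_BPS

print(max_bits_per_sample, max_sampling_hz, highest_safe_frequency_hz,
      raw_bps, min_compression_ratio)
```

For ii), content above half the sampling rate would have to be removed with a low-pass (anti-aliasing) filter before sampling.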
41
Explain the following terms:
(a) Image Resolution
(b) Bitmap
42
Generally, for gray input images what are half-toning and dithering? How are they related to each other? What is ordered dithering?
43
(a) In the simplest version of the median-cut algorithm, does it make any difference whether we assign bits in the order RGBRGBRG, or GBRGBRGB, etc.? Explain.
(b) Suppose we decide to quantize an 8-bit grayscale image down to just 2 bits of accuracy. What is the simplest way to do so? What ranges of byte values in the original image are mapped to what quantized values?
44
How would you create your own video wipe transition from the top-left corner of the viewport down to the bottom-right corner - a diagonal transition? Fig. 1 shows such a video transition.
Figure 1: Wipe transition, at $t / t_{\max} = 0.66$
Here, we wish to simply take pixels from either the first or the second video, depending on whether they are above or below the moving diagonal line.
Write some pseudo-C or pseudo-Premiere pseudocode to produce correct pixel values during the transition.
Hint: for any $x$ and $y$ position, we can determine where the line that $\{x, y\}$ is on, which is parallel to the wipe, cuts the main diagonal from top-left to bottom-right of the frame, simply using similar triangles. To do so, it's easiest to calculate the y-intercept of that line (where it hits the y-axis).
Figure 2: Wipe transition geometry.
45
[Table: MIDI messages with status byte hex values]
etc.
Notes: 'x' in status byte hex value stands for a channel number.
(a) 1) Is the Pitch Bend MIDI message a Channel Message?
2) The Pitch Bend opcode in MIDI is followed by two data bytes specifying how the control is to be altered. How many bits of accuracy does this amount of data correspond to? Why?
3) The MIDI communications standard specifies 31250 bps (bits per sec); how many Pitch Bend messages could be sent in 3 seconds if the message stream consisted only of Pitch Bend messages?
(b) The note "A above Middle C" (with frequency 440 Hz) is note 69 in General MIDI. What MIDI bytes (in hex) should be sent to play a note twice the frequency of (i.e., one octave above) "A above Middle C" at maximum volume on channel 1? (Don't include start/stop bits.) Information: An octave is 12 steps on a piano, i.e., 12 notes up.
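The numbers in this question can be worked through as below. The sketch assumes the usual MIDI framing of 10 bits on the wire per byte (one start bit, eight data bits, one stop bit), 7 usable bits per data byte, no running status, and that "channel 1" is encoded as channel nibble 0 in the Note On status byte `0x9n`; the constant names are illustrative.

```python
BAUD = 31_250                 # bits per second, from the question
WIRE_BITS_PER_BYTE = 10       # assumed: start + 8 data + stop
DATA_BITS_PER_BYTE = 7        # top bit of a data byte is always 0

# (a) 2) resolution of the two Pitch Bend data bytes
pitch_bend_bits = 2 * DATA_BITS_PER_BYTE

# (a) 3) Pitch Bend = 1 status byte + 2 data bytes, no running status
bits_per_message = 3 * WIRE_BITS_PER_BYTE
messages_in_3s = (BAUD * 3) // bits_per_message

# (b) Note On, channel 1, one octave above note 69, maximum velocity
NOTE_ON = 0x90
channel_nibble = 0            # channel 1 is nibble 0
note = 69 + 12
velocity = 127
message = bytes([NOTE_ON | channel_nibble, note, velocity])

print(pitch_bend_bits, messages_in_3s, message.hex(" ").upper())
```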
46
When we create a sprite, we usually completely replace pixel values by those in the sprite. Suppose, however, we wish to replace pixel values by a combination with weights $\alpha$ from the original background, plus $(\alpha - 1)$ from the sprite, where $\alpha$ is in $[0, 1]$.
State how you could do this, using only LOGICAL operators and an ADD.
For this question, assume the following: (1) You already have a sprite $S_{\alpha}$ which has the character's colors all multiplied by the factor $\alpha$, and black outside the character;
(2) You already have a second image of the (entire) background, already multiplied by $(1 - \alpha)$, called $B_{1-\alpha}$.
Answer:
47
It is known that a loss of audio output at both ends of the audible frequency range is inevitable due to the frequency response function of audio amplifier and medium (e.g., tape).
(a) If the output was 1 volt for frequencies at mid-range, after a loss of $-3$ dB at 18 kHz what is the output voltage at this frequency? [Hint: Assume $\log_{10} 2 = 0.3$.]
(b) To compensate for the loss, a listener can adjust the gain (and hence the output) at different frequencies from an equalizer. If the loss remains $-3$ dB and a gain through the equalizer is 6 dB at 18 kHz, what is the output voltage now?
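A numeric check of both parts, assuming the decibel figures refer to voltage (amplitude) ratios, so that $\mathrm{dB} = 20 \log_{10}(V_{\text{out}} / V_{\text{ref}})$; the function name is illustrative.

```python
def db_to_voltage(db, v_ref=1.0):
    """Voltage after a gain of `db` decibels, using the 20*log10 convention."""
    return v_ref * 10 ** (db / 20)


loss_db = -3
equalizer_gain_db = 6

v_after_loss = db_to_voltage(loss_db)
v_after_equalizer = db_to_voltage(loss_db + equalizer_gain_db)

print(round(v_after_loss, 3), round(v_after_equalizer, 3))
```

With the hint $\log_{10} 2 = 0.3$, $-3$ dB corresponds to $2^{-1/2} \approx 0.71$ V, and the net $+3$ dB in part (b) to $2^{1/2} \approx 1.41$ V.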
48
(a) The note "A above Middle C" (with frequency 440 Hz) is note 69 in General MIDI. What MIDI bytes (in hex) should be sent to play a note twice the frequency of (i.e., one octave above) "A above Middle C" at maximum volume on "Channel 1"? (Don't include start/stop bits.)
Information: An octave is 12 steps on a piano, i.e., 12 notes up.
(b) What bytes should be sent immediately after that?
49
Briefly, for grey input images explain what half-toning and dithering are. How are they related to each other? What is ordered dithering?
50
Suppose we acquire a video which has been compressed using Motion-JPEG, and import it into Adobe Premiere (or a similar program). Then we create a movie using an MPEG-4 codec. Comment on the
(a) compression ratio
(b) appearance of the resulting video.
51
When we view video on a computer, the analog video is digitized and stored in the frame buffer of the video "frame grabber" card.
Suppose that a video is digitized at (integer) NTSC frame rate, has size $640 \times 480$ pixels, and is stored with a bit depth of 24 bits. We're interested in displaying the captured video.
(a) What must be the minimal bandwidth of the system bus in Mbps when data is moved from the video frame grabber to the memory for video display?
(b) How much storage capacity (in GBytes) is required to store 1 minute of this video?
(c) Explain why you don't see a flicker effect on your workstation screen when displaying this video at NTSC frame rate?
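Parts (a) and (b) reduce to simple arithmetic, sketched here. It assumes "integer NTSC frame rate" means 30 frames per second, 1 Mbps = $10^6$ bit/s, and 1 GByte = $10^9$ bytes; the constant names are illustrative.

```python
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 24
FPS = 30                      # assumed integer NTSC frame rate

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bus_bps = bits_per_frame * FPS
bus_mbps = bus_bps / 1e6      # assumed 1 Mbps = 10**6 bit/s

bytes_per_minute = bus_bps * 60 // 8
gbytes_per_minute = bytes_per_minute / 1e9   # assumed 1 GByte = 10**9 bytes

print(bus_mbps, bytes_per_minute, round(gbytes_per_minute, 2))
```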
52
The "hue" is the colour, independent of brightness and how much pure white has been added to it. We can make a simple definition of hue as the set of ratios R:G:B.
Suppose a colour (i.e., an RGB) is divided by 2.0, so that the RGB triple now has values 0.5 times its former values.
Explain using numerical values:
(a) If gamma-correction is not applied, does the second RGB have the same hue as the first RGB, when displayed? (we're not discussing any psychophysical effects that change our perception - here we're just worried about the machine itself).
(b) State all colour triples for which the hue is unchanged.