Deck 5: Lossy Compression Algorithms, Image Compression Standards and Basic Video Compression Techniques

ملء الشاشة (f)
exit full mode
سؤال
Explain why the factorization steps in Eqs. (8.24) and (8.25) save CPU cycles. About how much faster is the factorized approach to the DCT than the straightforward definition in (8.17)? (In both approaches, assume that of course the cosine matrix values are pre-stored outside any loop.)
استخدم زر المسافة أو
up arrow
down arrow
لقلب البطاقة.
سؤال
The top part of Figure 8.1 shows the DCT F(u)F(u) for a 1-D signal f(i)f(i) .
i. What does F(5)F(5) tell you in terms of the image content in f(i)f(i) ?
ii. What is the reason that F(5)F(5) is less than 0 ?
(b) The Inverse Discrete Cosine Transform (IDCT) can be viewed as a reconstruction process in which the Cosine basis functions are scaled and added one by one to reconstruct f~(i)\tilde{f}(i) . Assume F(0)=500F(0)=500 and F(1)=70F(1)=-70 ; in the space provided at the bottom part of the figure,
i. draw f~(i)\tilde{f}(i) after only the F(0)F(0) is used,
ii. draw f~(i)\tilde{f}(i) after F(0)F(0) and F(1)F(1) are used.
Show all details including magnitudes, and explain why they look like that.
سؤال
Suppose we have 32 samples from a mono audio file.
(a) If the audio was sampled at 8kHz8 \mathrm{kHz} , how many seconds does this represent?
(b) If we carry out a DCT with blocksize equal to 32, how many DCT coefficients do we end up with?
(c) Numbering the DCT coefficients from 0,
i) What frequency, in Hz\mathrm{Hz} , does coefficient number 1 represent?
ii) What frequency, in Hz\mathrm{Hz} , does coefficient number 16 represent?
سؤال
Another name for zig-zag coding is "zonal coding". Suppose you invent a new zonal coding scheme for JPEG that simply discards diagonals above the first few. Suppose we keep the first six zig-zag lines.
(a) How many coefficients are we keeping?
(b) How will we do, compared to keeping all the zig-zags (still using run-length encoding)? Comment on both compression capability and image quality.
سؤال
Briefly explain why DCT was chosen for JPEG compression.
سؤال
Why does JPEG give compression? At which points in the algorithm does compression occur? Please estimate (as a general guess) how much compression occurs at each of these points.
سؤال
Explain why the block DCT is preferred to taking the whole image DCT in JPEG compression.
سؤال
Describe how H.261 deals with temporal and spatial redundancies in video.
سؤال
Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).)
To simplify the data, we assume that the size of macroblocks is 8×88 \times 8 instead of 16×1616 \times 16 .
Intensity Values for the First Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:  <div style=padding-top: 35px>
Intensity Values for the Second Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:  <div style=padding-top: 35px>
سؤال
We have seen that a logarithmic-based block search strategy is fast: it is "suboptimal but still usually effective." That being the case, why should we be bothered carrying out a full search in any situation? Why not just stick to the fast method - what is the advantage to be gained, if any, from a comprehensive, optimal search strategy?
فتح الحزمة
قم بالتسجيل لفتح البطاقات في هذه المجموعة!
Unlock Deck
Unlock Deck
1/10
auto play flashcards
العب
simple tutorial
ملء الشاشة (f)
exit full mode
Deck 5: Lossy Compression Algorithms, Image Compression Standards and Basic Video Compression Techniques
1
Explain why the factorization steps in Eqs. (8.24) and (8.25) save CPU cycles. About how much faster is the factorized approach to the DCT than the straightforward definition in (8.17)? (In both approaches, assume that of course the cosine matrix values are pre-stored outside any loop.)
Not Answer
2
The top part of Figure 8.1 shows the DCT F(u)F(u) for a 1-D signal f(i)f(i) .
i. What does F(5)F(5) tell you in terms of the image content in f(i)f(i) ?
ii. What is the reason that F(5)F(5) is less than 0 ?
(b) The Inverse Discrete Cosine Transform (IDCT) can be viewed as a reconstruction process in which the Cosine basis functions are scaled and added one by one to reconstruct f~(i)\tilde{f}(i) . Assume F(0)=500F(0)=500 and F(1)=70F(1)=-70 ; in the space provided at the bottom part of the figure,
i. draw f~(i)\tilde{f}(i) after only the F(0)F(0) is used,
ii. draw f~(i)\tilde{f}(i) after F(0)F(0) and F(1)F(1) are used.
Show all details including magnitudes, and explain why they look like that.
Not Answer
3
Suppose we have 32 samples from a mono audio file.
(a) If the audio was sampled at 8kHz8 \mathrm{kHz} , how many seconds does this represent?
(b) If we carry out a DCT with blocksize equal to 32, how many DCT coefficients do we end up with?
(c) Numbering the DCT coefficients from 0,
i) What frequency, in Hz\mathrm{Hz} , does coefficient number 1 represent?
ii) What frequency, in Hz\mathrm{Hz} , does coefficient number 16 represent?
(a) 32/8000sec==4msec32/8000 \mathrm{sec}==4 \mathrm{msec}
(b) 32
(c) i) 1/21/2 cycle per 32 samples
=1/2=1/2 cycle per 32/800032 / 8000 sec
=1/2/(32/8000)=1/2 /(32/8000) cycles-per-sec =4000/32=4000 / 32 cycles-per-sec =125 Hz=125 \mathrm{~Hz}
 (a)  32/8000 \mathrm{sec}==4 \mathrm{msec}  (b) 32 (c) i)  1/2  cycle per 32 samples  =1/2  cycle per  32 / 8000  sec  =1/2 /(32/8000)  cycles-per-sec  =4000 / 32  cycles-per-sec  =125 \mathrm{~Hz}    Fig. 8.1: DCT for a 1-D signal; reconstruction using only  F(0)  as opposed to only  F(0)  and  F(1) . ii)  =8  cycles per  32/8000 \mathrm{sec}   =8 /(32/8000)  cycles-per-sec  =2 \mathrm{kHz}  (Note that the max frequency we analyze for is  4 \mathrm{kHz} .) BEGIN not\_on\_exam:  \begin{aligned} & \# \text { S-plus language script. } \\ & \# \text { 1Ddct.s } \\ & \text { blocksize }=32 \\ & \text { pi }=3.14159265 \\ & \text { factor }=2.0 / \text { blocksize } \\ & \text { denom }=\left(2.0^{*} \text { blocksize }\right) / p i \\ & \text { Lambda }=\text { function }(u)\{ \\ & \text { if }(u==0) \text { return }(1 / \operatorname{sqrt}(2))\\ & \text{else return(1)}\\ & \}~\# \text{end of Lambda} \end{aligned}    END not on exam. Fig. 8.1: DCT for a 1-D signal; reconstruction using only F(0)F(0) as opposed to only F(0)F(0) and F(1)F(1) .
ii) =8=8 cycles per 32/8000sec32/8000 \mathrm{sec}
=8/(32/8000)=8 /(32/8000) cycles-per-sec =2kHz=2 \mathrm{kHz}
(Note that the max frequency we analyze for is 4kHz4 \mathrm{kHz} .)
BEGIN not\_on\_exam:
# S-plus language script. # 1Ddct.s  blocksize =32 pi =3.14159265 factor =2.0/ blocksize  denom =(2.0 blocksize )/pi Lambda = function (u){ if (u==0) return (1/sqrt(2))else return(1)} #end of Lambda\begin{aligned}& \# \text { S-plus language script. } \\& \# \text { 1Ddct.s } \\& \text { blocksize }=32 \\& \text { pi }=3.14159265 \\& \text { factor }=2.0 / \text { blocksize } \\& \text { denom }=\left(2.0^{*} \text { blocksize }\right) / p i \\& \text { Lambda }=\text { function }(u)\{ \\& \text { if }(u==0) \text { return }(1 / \operatorname{sqrt}(2))\\& \text{else return(1)}\\& \}~\# \text{end of Lambda}\end{aligned}
 (a)  32/8000 \mathrm{sec}==4 \mathrm{msec}  (b) 32 (c) i)  1/2  cycle per 32 samples  =1/2  cycle per  32 / 8000  sec  =1/2 /(32/8000)  cycles-per-sec  =4000 / 32  cycles-per-sec  =125 \mathrm{~Hz}    Fig. 8.1: DCT for a 1-D signal; reconstruction using only  F(0)  as opposed to only  F(0)  and  F(1) . ii)  =8  cycles per  32/8000 \mathrm{sec}   =8 /(32/8000)  cycles-per-sec  =2 \mathrm{kHz}  (Note that the max frequency we analyze for is  4 \mathrm{kHz} .) BEGIN not\_on\_exam:  \begin{aligned} & \# \text { S-plus language script. } \\ & \# \text { 1Ddct.s } \\ & \text { blocksize }=32 \\ & \text { pi }=3.14159265 \\ & \text { factor }=2.0 / \text { blocksize } \\ & \text { denom }=\left(2.0^{*} \text { blocksize }\right) / p i \\ & \text { Lambda }=\text { function }(u)\{ \\ & \text { if }(u==0) \text { return }(1 / \operatorname{sqrt}(2))\\ & \text{else return(1)}\\ & \}~\# \text{end of Lambda} \end{aligned}    END not on exam. END not on exam.
4
Another name for zig-zag coding is "zonal coding". Suppose you invent a new zonal coding scheme for JPEG that simply discards diagonals above the first few. Suppose we keep the first six zig-zag lines.
(a) How many coefficients are we keeping?
(b) How will we do, compared to keeping all the zig-zags (still using run-length encoding)? Comment on both compression capability and image quality.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
5
Briefly explain why DCT was chosen for JPEG compression.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
6
Why does JPEG give compression? At which points in the algorithm does compression occur? Please estimate (as a general guess) how much compression occurs at each of these points.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
7
Explain why the block DCT is preferred to taking the whole image DCT in JPEG compression.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
8
Describe how H.261 deals with temporal and spatial redundancies in video.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
9
Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).)
To simplify the data, we assume that the size of macroblocks is 8×88 \times 8 instead of 16×1616 \times 16 .
Intensity Values for the First Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:
Intensity Values for the Second Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
10
We have seen that a logarithmic-based block search strategy is fast: it is "suboptimal but still usually effective." That being the case, why should we be bothered carrying out a full search in any situation? Why not just stick to the fast method - what is the advantage to be gained, if any, from a comprehensive, optimal search strategy?
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.
فتح الحزمة
k this deck
locked card icon
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 10 في هذه المجموعة.