3d_wavelet

Category: Wavelet transform
Development tool: Visual C++
File size: 6706KB
Downloads: 121
Upload date: 2006-05-19 00:55:44
Uploader: riverzjs
Description: An image compression algorithm based on wavelet transform coding; useful for learning wavelet coding methods.

File list:
3d_wavelet\vs4_distribution\Announcement (5860, 1994-08-30)
3d_wavelet\vs4_distribution\graph_scales_info.c (15492, 1994-08-25)
3d_wavelet\vs4_distribution\Makefile (4338, 1994-08-25)
3d_wavelet\vs4_distribution\make_scales_info.c (8201, 1994-08-25)
3d_wavelet\vs4_distribution\receive.c (31647, 1994-08-25)
3d_wavelet\vs4_distribution\trace_to_tags.c (24561, 1994-08-25)
3d_wavelet\vs4_distribution\transmit.c (18012, 1994-08-25)
3d_wavelet\vs4_distribution\subband\.pure (0, 1994-08-25)
3d_wavelet\vs4_distribution\subband\analysis.c (45624, 1994-08-25)
3d_wavelet\vs4_distribution\subband\subband.h (1270, 1994-08-25)
3d_wavelet\vs4_distribution\subband\synthesis.c (50613, 1994-08-25)
3d_wavelet\vs4_distribution\subband (0, 2004-03-02)
3d_wavelet\vs4_distribution\structure\.pure (0, 1994-08-25)
3d_wavelet\vs4_distribution\structure\lex.yy.c (17877, 1994-08-25)
3d_wavelet\vs4_distribution\structure\structure.c (54341, 1994-09-21)
3d_wavelet\vs4_distribution\structure\structure.h (21885, 1994-08-25)
3d_wavelet\vs4_distribution\structure\structure.lex (3585, 1994-08-25)
3d_wavelet\vs4_distribution\structure\structure.pdf (99172, 2001-11-29)
3d_wavelet\vs4_distribution\structure\token.h (1233, 1994-08-25)
3d_wavelet\vs4_distribution\structure (0, 2004-03-02)
3d_wavelet\vs4_distribution\statistics\.pure (0, 1994-08-25)
3d_wavelet\vs4_distribution\statistics\statistics.c (18419, 1994-08-25)
3d_wavelet\vs4_distribution\statistics\statistics.h (2852, 1994-08-25)
3d_wavelet\vs4_distribution\statistics (0, 2004-03-02)
3d_wavelet\vs4_distribution\pan\.pure (0, 1994-08-25)
3d_wavelet\vs4_distribution\pan\apply_pan.c (32568, 1994-08-25)
3d_wavelet\vs4_distribution\pan\fft.c (5259, 1994-08-25)
3d_wavelet\vs4_distribution\pan\fft_dp.c (5282, 1994-08-25)
3d_wavelet\vs4_distribution\pan\get_pan.c (29531, 1994-09-21)
3d_wavelet\vs4_distribution\pan\pan.h (2838, 1994-08-25)
3d_wavelet\vs4_distribution\pan (0, 2004-03-02)
3d_wavelet\vs4_distribution\packets\.pure (0, 1994-08-25)
3d_wavelet\vs4_distribution\packets\packets.c (2982, 1994-08-25)
3d_wavelet\vs4_distribution\packets\packets.h (22372, 1994-08-25)
3d_wavelet\vs4_distribution\packets\packets.pdf (134637, 2001-11-29)
3d_wavelet\vs4_distribution\packets\packets_in.c (68650, 1994-08-25)
3d_wavelet\vs4_distribution\packets\packets_out.c (89485, 1994-08-25)
3d_wavelet\vs4_distribution\packets (0, 2004-03-02)
3d_wavelet\vs4_distribution\memory\.pure (0, 1994-08-25)
... ...

README for DWIT video codec. Version 2.0

LICENSE

DWIT video codec. Copyright (C) 1999,2000,2001 Peter rbk

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

DESCRIPTION

The DWIT video codec is a differential software video codec based on integer wavelet transformation. The tcontrol program couples the codec with IP multicast and unicast transmission capability, as well as a video mixer.

This software is developed on an SGI Octane with a Personal Video board (EVO) running IRIX 6.5.10m. It has also been known to run on the O2 (IRIX 6.5.4m) with the MVP video board. Version 2.0 has been ported to SuSE Linux 7.0 (i386, kernel 2.2.16) with video4linux. The code also runs in receive-only mode with display in one or more X windows on the following architectures:

- Linux 2.2 x86
- Solaris 2.6 SPARC
- HP-UX 10.20 PA-RISC

Video input is taken from the default video input node as set through the video panel. Video output likewise goes to the default video output node. The Linux port does not yet support analog video output.

This software is intended to be used for mbone conferencing, especially for connecting classrooms. The audio side of the conference is to be taken care of by the standard mbone tools like vat or rat, or something else.

REQUIREMENTS

- SGI Octane or O2 running IRIX 6.5 or later, or a Linux box with a frame grabber card supported by video4linux, for sending.
  A Linux, Solaris or HP-UX box is adequate for reception only.
- Video I/O board (Personal Video or MVP).
- SGI C++ compiler or GCC 2.8.1 or later. (The SGI CC is better.)
- The SGI C compiler must be present for run-time code generation.
- The C++ STL v3 or later. May be downloaded from http://www.sgi.com/Technology/STL/download.html
- The SGI VL and DMedia libraries (on SGIs).
- X11R6 (Xlib and Xext).
- GNU Make

The code may run on other SGI machines/video boards, but this is untested. You need at least IRIX 6.3 in all cases.

INSTALLATION

    # edit the Makefile to suit your OS
    gmake depend   # optional
    gmake

INVOKING THE CODEC

The codec is started by the command:

    tcontrol [options] addr/port/ttl

where addr is a DNS hostname or an IP address in dotted quad decimal format. The address may be either unicast or multicast. Port is the UDP port number to use for the SCP messages. The value of port + 1 will be used for the video data stream. Ttl is the time-to-live of the video data packets.

COMMAND LINE OPTIONS

-h <n>  Set grabbing and transmission resolution.
          = 0: 128x96. Grabs fields and zooms.
          = 1: 192x128. Grabs fields and zooms.
          = 2: 256x192. Grabs fields.
          = 3: 320x224. Grabs fields. [default]
          = 4: 512x384. Grabs frames.
          = 5: ***0x448. Grabs frames.
          = 6: 352x288. CIF format.
          = 7: 176x144. QCIF, grabs fields & zooms.
          = 8: 288x208. Grabs fields.
        This only affects the resolution of the video /sent/ by this node.

-D <n>  Set the SCP (Session Control Protocol) delay counter to <n> [default = 500]. This determines how often to send SCP messages.

-g <n>    = 1: enable frame grabbing [default].
          = 0: disable frame grabbing for receive-only operation.

-d <n>    = 1: enable video-out display [default].
          = 0: disable video-out display for send-only operation.

-s <n>    = 1: enable network sends [default].
          = 0: disable network sends for receive-only operation.

-r <n>    = 1: enable network reception [default].
          = 0: disable network reception for send-only operation.

-v <n>    = 1: enable viewing local video cut-through.
          = 0: disable viewing of local video [default].
        When used with the -w option, local video is also shown in a separate window.

-f <fps>  Set desired frame rate, where <fps> is the frame rate in frames per second, 1 <= <fps> <= 30. It may take up to 20 seconds before the desired frame rate is approximated. The frame rate is of course also limited by the performance of the system, and at high resolutions the actual frame rate may be much lower than desired. [default = 20]

-b <n>  1 <= <n> <= 4: the number of the least significant resolution layer to send. Larger numbers give better quality and less compression. [default = 4] Four or five seems to be a good choice. The largest allowable value for this is one more than the depth of the wavelet transform pyramid, equal to the log_2 of the block side length.

-m <n>  1 <= <n> <= 6: set the number of the least significant bits to send from the transformed picture [default = 3]. Larger numbers give poorer quality and better compression. This only affects the network video sent from this node; it may still receive video at any other quality from other nodes.

-B <kbps>  NOT CURRENTLY OPERABLE. Enable bandwidth adaptation, with an upper limit of <kbps> kbps. The codec will adjust its current minbit (-m) and botlevel (-b) parameters to keep the sending bandwidth use below, but close to, <kbps> kbps. The default is no bandwidth limitation and quality adaptation.

-F <n>  Set the temporal filter limit for the low-frequency filter; <n> determines the break from quadratic growth to the identity function. Experimental. [default = 7]

-H <n>  Set the temporal filter limit for the high-frequency filter; <n> determines the break from quadratic growth to the identity function. Experimental. [default = 7] This can be used as a noise filter.

-T <n>  Set the change prediction threshold. Higher values mean more motion is necessary before a difference block is transmitted, i.e. less bandwidth is used. This is very useful to put a cap on the bandwidth in the face of noisy camera signals. Values that are too high (>20) induce artifacts.
        Setting the value to 0 turns off change prediction. Defaults to 12.

-u <name>  Set the username for this conference node to <name>. Corresponds to the arguments to -P options at other nodes.

-e <addr>  Set the e-mail contact address for this node.

-l <n>  Set the local video output mixer layout to <n>:
          0: one source filling the whole frame [default]
          1: two sources side by side.
          2: four sources of the same size.
          3: four sources with the first being larger.
          4: nine sources of equal size.
          5: four sources, one large at top, three smaller below.

-N <n>  Set the layout number for the next -P option. This determines where a network video source is shown in multi-source output.

-P <name>  Put network video from the node named <name> (its username) into the output window numbered by the previous -N option.

-L <n>  Put the local video into video output window number <n>. Defaults to 0 (when -v1 is given).

-w <flags>  Enable video output in a number of windows. For X11 the codec uses shared memory for image transfer if the MIT-SHM extension is available. A small video window for each source will be displayed. No in-window mixing is done currently. Currently the X display is in greyscale on an 8 bit PseudoColor visual, and in color on a 24 bit TrueColor visual.

        The <flags> can be used to configure the window system interface. Currently two flags for X11 are supported. "1" disables the use of the MIT-SHM shared memory extension. This may be required when trying to run the display on a remote machine. This is NOT recommended! A small X video window may easily consume 8 or more megabits per second!! The second flag "2" enables 2x upscaling of video shown in X windows.

        Examples:
          -w0: enable X display, use MIT-SHM if possible
          -w1: enable X display, never attempt to use MIT-SHM
          -w2: enable upscaled X display, use MIT-SHM if possible
          -w3: enable upscaled X display, never attempt MIT-SHM

        Note: for X Windows display under IRIX it may be an advantage to

            setenv DISPLAY shm:0

        to take advantage of a shared memory connection to the X server.
EXAMPLES

A first test:

o Attach a camera to the O2 or Octane (the digital O2Cam, or a real video camera).
o Hook up a video display monitor to the video-out port of the computer. A TV can be used for this.
o In the video panel (vcp), set the default input to the camera, and the default video output to the video-out port. Make sure to select the right format (NTSC or PAL) as appropriate for camera and monitor.
o Invoke the command:

      tcontrol -l2 -v1 localhost/7777/7

  This should display 2 video images on the TV monitor: one directly grabbed from the camera, and one coded and decoded.

More examples.

Show greyscale video under X. Does not require an external TV/monitor:

    tcontrol -w0 -v1 localhost/7777/7

Send medium-res video at a good quality, showing the other party /only/ in full screen format:

    tcontrol -h2 -m2 -b5 -l0 10.0.34.34/3333/16

Loop-back, showing local uncompressed video beside compressed and uncompressed looped back video:

    tcontrol -v1 -h3 -m2 -l1 localhost/7777/77

Multicast video at low resolution, show local video in a small window, and a specific source (Primary) in the larger window:

    tcontrol -v1 -h0 -m2 -l3 -L3 -N0 -P Primary 224.45.45.45/3333/16

Send-only operation without any local display:

    tcontrol -h1 -m2 -d0 -v0 -r0 224.45.45.46/7777/77

Bandwidth use varies with resolution, quality, and the amount of motion in the picture. Good quality at high resolution requires about 900 kbps at 15 fps, so this is not designed for modem users. Low quality at low resolution may require about 100 kbps at 30 fps. You may use the -B option to make the codec dynamically adapt the quality to a given bitrate.

COMMAND LINE OPTIONS FOR THE REMOTE CONTROL

The network remote control program is a Perl 5 script invoked as

    rcontrol.pl [-h host] [-p port] {l|s|m|b|f|u|x|F|H|B} a [b]

The remote control protocol runs over UDP and may hence be unreliable. The command line options are:

-h  The hostname of the codec to send control packets to. A DNS name or a dotted quad IP address.
    Defaults to localhost.
-p  The remote control port number. Defaults to UDP port 8463.

l   Sets the layout of the remote node. As the -l option of tcontrol.
s   Swaps sources for windows a and b at the remote node.
m   Sets the minbits QoS measure at the remote node. As the -m option of tcontrol.
b   Sets the bitlevel QoS measure at the remote node. As the -b option of tcontrol.
f   Sets the sending framerate in fps at the remote node. As the -f option of tcontrol.
u   Maps the named source to a slot at the remote node. The name is a string as given to a -u option of another tcontrol node.
x   Swaps the named source with the slot number at the remote node.
p   "Ping" the server, returning status information such as framerate, bitrate, sizes and options.
F   Set the limit of the low frequency noise filter.
H   Set the limit of the high frequency noise filter.
B   Set (and enable) the bandwidth limiter. If the value is 0 then bandwidth limiting is disabled.

PRINCIPLES OF OPERATION

Below is an overview of the transmission and reception process:

. Grab a frame
. Reformat YUV 4:2:2 interleaved to 16 bit planar
. Wavelet transform (decompose)
. Encode blocks, keeping reference blocks
. Transmit blocks via unicast or multicast
. Receive blocks from network
. Use and update reference blocks
. Wavelet transform (reconstruct)
. Reformat 16 bit planar to 4:2:2 interleaved and scale up/down

The coder initializes itself to continuously grab frames from the camera. When a frame is grabbed in interleaved 4:2:2 YUV format (8 bits per component), it is reformatted into separate Y, U, and V planes at 16 bits per component. The frame is then wavelet transformed (decomposed) using the (5,3) integer wavelet transform of Fisher, Shao and Hua [1997]. The component planes are laid out as shown below, and transformed as one plane. A three level transformation is done (i.e. the basic transform is applied first to the raw data, then to the low frequency components of the transformed results, and so on, forming a 3 level pyramid.)
    +---------+
    |    Y    |
    |         |
    +----+----+
    | U  | V  |
    |    |    |
    +----+----+

After transformation, the plane is divided into 8x8 blocks, which are scheduled for transmission according to a random pattern. The block size corresponds to the number of decomposition levels, so each block contains one pyramid. However, the higher levels of a pyramid contain information derived from adjacent blocks as well.

In order to do temporal compression, difference blocks between a reference frame and the new frame may be transmitted, if the age of the block is below a certain threshold (a given block may only be transmitted as a difference a certain number of times in a row; this helps recovery from packet loss). The coder may also choose to send a reference block if the encoding of the previous difference block was larger than the last reference block. For blocks that have changed much since the last reference block (because of motion), it is beneficial to send a new reference block earlier than it would otherwise have been sent.

Blocks do not all start out at the same age. This distributes the transmission of reference blocks across many frames, which helps avoid bandwidth spikes due to reference frame transmission.

Blocks are encoded using a zero-tree based scheme inspired by the SPIHT algorithm of Said and Pearlman [1996]. However, whereas SPIHT is fully embedded and always transmits one bit per tree node at a time, our algorithm transmits all the desired bits for a node at once. This makes the algorithm non-embedded, but since we assume that entire packets are wholly lost or wholly transmitted (as is typical on a packet network), this does not matter. The reason for the different algorithm is that I didn't succeed in making an implementation of SPIHT fast enough. The algorithm used in DWIT does not have to employ linked lists, for example. See the bottom of codec.cc for the algorithm.
A central part of the algorithm consists of finding, for each subtree of a pyramid, the number of significant bits therein. Linear code for this computation is generated during program building in the maximizer.cc file.

Compression rate versus quality and framerate can be adjusted by setting the number of the least significant bit to send (-m), as well as setting the lowest level of the pyramid to send (-b).

COPING WITH JITTER

Cheap cameras may provide a signal that jitters in intensity and color: while the camera is viewing a constant static scene, it may send back images with small fast variations in intensity and color. The visual effect of this jitter is amplified by the wavelet compression process (a small variation between original frames may result in large variations after compression.)

The DWIT codec implements a countermeasure to this jitter in the form of a temporal filter. Per-pixel differences between consecutive frames are measured, and if they are below a certain threshold they are mapped through a quadratic function, so as to make small differences much smaller. Differences above the threshold are sent through as is. This temporal filter is inserted *after* wavelet decomposition, as this yielded the best qualitative results.

Traditional video codec algorithms cope with the jitter during the block motion estimation and compensation phase. DWIT does not do block based motion estimation, in order to avoid blocking artifacts at low bandwidths.

ACKNOWLEDGEMENT

This software has been developed at the University of Aarhus, Denmark, and partially funded by the Danish National Centre for IT-Research. Neither the University of Aarhus nor the Danish National Centre for IT-Research assumes any responsibility for this code whatsoever.

Frequently Asked Questions.
---------------------------

Q. The local video image does not appear.
A. Check the following:
   - Is the -v1 command line option used?
   - Is the -wX option used to show video in a window?
   - For video output: does the video layout support more than one source (-l1 and above does)?
   - Is the correct default video input selected in the video panel?
   - Is the correct video format for the input selected in the video panel?
   - Is the camera connected and switched on, and to which input connector?
   - Is the video monitor switched on and connected to the right output?

Make sure all of the above is checked. You may need to switch on the camera and restart the codec. Sometimes the SGI will not synchronize to the video signal when the camera is switched on. Restarting the codec while the camera sends video may help.
