Content-based Image Sequence Representation

Three-dimensional video representations: We developed algorithms that process monocular video sequences and extract 3D models of rigid objects present in the scene. The shape of the object is described by patches, e.g., planar patches or, more generally, polynomial patches. Our algorithms factor a rank one matrix to obtain the rigid 3D shape and the rigid 3D motions of the objects. This work is described in several papers, see for example Aguiar and Moura [2001,2003], and part of it was patented, see “System and Method for Generating a Three-dimensional Model from a Two-Dimensional Image Sequence” (allowed by the United States Patent Office in February 2004). Earlier work fused texture information from a monocular video sequence with laser range measurements to build textured 3D models of objects, see Martins and Moura.
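To make the phrase "factor a rank one matrix" concrete, the following minimal sketch recovers the two factors of a rank one matrix by power iteration. It is a toy illustration under stated assumptions, not the algorithm from the cited papers: the function name `rank_one_factor`, the data, and the reading of the factors as per-frame motion parameters and per-feature depths are hypothetical.

```python
import math

def rank_one_factor(R, iters=50):
    """Approximate R (a list of rows) by the outer product u v^T using power iteration.

    Assumes R is nonzero and the all-ones start vector is not orthogonal
    to the dominant right factor; otherwise the normalization divides by zero.
    """
    rows, cols = len(R), len(R[0])
    v = [1.0] * cols
    for _ in range(iters):
        # u <- R v, normalized to unit length
        u = [sum(R[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- R^T u, so v carries the overall scale
        v = [sum(R[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return u, v

# Hypothetical toy data: one motion parameter per frame, one relative depth per feature.
motion = [0.5, -1.0, 2.0]
depth = [1.0, 3.0, -2.0, 0.5]
R = [[m * z for z in depth] for m in motion]  # rank one by construction

u, v = rank_one_factor(R)
recon = [[ui * vj for vj in v] for ui in u]
max_err = max(abs(R[i][j] - recon[i][j]) for i in range(len(R)) for j in range(len(R[0])))
print(max_err)
```

The factors are recovered only up to a common scale (here `u` has unit norm and `v` absorbs the magnitude), which is the usual ambiguity of any such factorization.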

Modeling human motion: We extended generative video (see below) to capturing and modeling video sequences with human walkers, see our papers Cheng and Moura [1999,1998].

Generative video: Our work is concerned with developing representations for video sequences based on their content. These representations differ from those developed for MPEG/H.26X coding standards in that sequences are described in terms of extended images instead of collections of frames. We describe how these extended images, e.g., mosaics, are generated by essentially the same principle: the incremental composition of visual photometric, geometric, and multi-view information into one or more extended images. Different outputs, e.g., from single 2-D mosaics to full 3-D mosaics, are obtained depending on the quality and quantity of photometric, geometric, and multi-view information. In particular, we developed a framework, generative video, that is well suited to the representation of scenes with independently moving objects. Content-based video representations can potentially provide compression ratios in the range of 1000:1 with acceptable quality. This work is the subject of several papers; a comprehensive description is in the CRC Chapter [2004] and in the papers [1996,1995a,1995b]. Part of the work is patented, see “Generative Video: Very Low Bit Rate Video Compression.”

Video over wireless: We demonstrated the high compression at good quality provided by generative video with a very early (1995) demonstration of video transmission over wireless. The demonstration interfaced two highly heterogeneous wireless networks: a ‘fast’ 2 Mbps local area network (CMU’s Wireless Andrew) and a ‘slow’ 19.6 Kbps metropolitan area network. The results were reported in [1996], in an invited paper published by the IEEE Personal Communications Magazine.
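As a rough back-of-the-envelope check of why compression on the order of 1000:1 matters for a 19.6 Kbps link, the sketch below computes the raw bit rate of an uncompressed sequence and divides it by that ratio. The frame size (176×144), the 12 bits per pixel, and the 30 frames per second are illustrative assumptions and are not taken from the demonstration itself.

```python
def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    """Bit rate of uncompressed video in bits per second."""
    return width * height * bits_per_pixel * fps

# Assumed source format (hypothetical): 176x144 frames, 12 bits/pixel, 30 frames/s.
raw = raw_bitrate_bps(176, 144, 12, 30)

compression_ratio = 1000          # order of magnitude quoted for content-based representations
compressed = raw / compression_ratio

slow_link_bps = 19_600            # the 'slow' metropolitan area network
fast_link_bps = 2_000_000         # the 'fast' local area network

fits_slow = compressed <= slow_link_bps
fits_fast_uncompressed = raw <= fast_link_bps
print(raw, compressed, fits_slow, fits_fast_uncompressed)
```

Under these assumptions the uncompressed sequence exceeds even the 2 Mbps link, while the same sequence at 1000:1 fits within the slow link.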

Predictive lossy compression: We developed noncausal random field models to describe image texture, and then novel predictive coding algorithms that compared very favorably with transform-based coders; see for example the patent “Noncausal Predictive Image Codec,” or the papers Balram and Moura [1996,1993], Moura and Balram [1992], and Asif and Moura [1996].

Book Chapters (additional chapters in Book Chapters)

  • Pedro M. Q. Aguiar, Radu Jasinschi, José M. F. Moura, and Charnchai Pluempitiwiriyawej, “Content-based Image Sequence Representation,” ed. Todd Reed, in Digital Image Sequence Processing: Compression and Analysis, CRC Press Handbook, in press, 2004 (61 pages). Invited Chapter.
  • José M. F. Moura and Nikhil Balram, “Statistical Algorithms for Noncausal Markov Random Fields,” in Handbook of Statistics, edts. N. K. Bose and C. R. Rao, Chapter 15, North Holland, Amsterdam, The Netherlands, July 1993. Invited Chapter.

Selected Journal Papers (additional papers in Publications)

Selected Conference Papers (additional papers in Conference Publications)

  • Pedro M. Q. Aguiar and José M. F. Moura, “3D Rigid Structure from Video: What Are “Easy” Shapes and “Good” Motions?,” 5th IEEE Workshop on Multimedia Signal Processing, September 2002.
  • Amir Asif and José M. F. Moura, “Fast Inversion of L-Block Banded Matrices and their Inverses,” ICASSP’02, IEEE International Conference on Signal Processing, vol. 2, pp. 1369-1372, Orlando, FL, 12-17 May 2002.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Image Motion Estimation – Convergence and Error Analysis,” ICIP ’01, IEEE Proceedings of International Conference on Image Processing, vol. 2, pp. 937-940, Greece, 7-10 October, 2001.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Maximum Likelihood Estimation of the Template of a Rigid Moving Object,” EMMCVPR’01, Energy Minimization Methods in Computer Vision and Pattern Recognition, July 2001.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Weighted Factorization,” ICIP ’00, IEEE Proceedings of International Conference on Image Processing, Vancouver, British Columbia, Canada, October 2000.
  • Amir Asif and José M. F. Moura, “Block Banded Matrix Approximations: Application to Kalman-Bucy filter,” ICASSP’00, IEEE International Conference on Signal Processing, Istanbul, Turkey, June 2000.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Fast 3D Modeling from Video,” 3rd IEEE Workshop on Multimedia Signal Processing, Copenhagen, Denmark, December 1999.
  • Pedro M. Q. Aguiar and José M. F. Moura, “A Fast Algorithm for Rigid Structure from Image Sequences,” ICIP ’99, IEEE International Conference on Image Processing, Yokohama, Japan, October 1999.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Maximum Likelihood Inference of 3D Structure from Image Sequences,” EMMCVPR’99, Energy Minimization Methods in Computer Vision and Pattern Recognition, July 1999.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Factorization as a Rank 1 Problem,” CVPR’99, Computer Vision and Pattern Recognition Conference, June 1999.
  • Jia-Ching Cheng and José M. F. Moura, “Capture and Synthesis of Human Motion in Video Sequences,” 2nd IEEE Workshop on Multimedia Signal Processing, Santa Monica, CA, December 7-9, 1998.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Video Representation via 3D Shaped Mosaics,” ICIP ’98, IEEE Proceedings of International Conference on Image Processing, Chicago, Illinois, October 1998.
  • Pedro M. Q. Aguiar and José M. F. Moura, “Robust 3D Structure From Motion Under Orthography,” IEEE Image and Multidimensional Digital Signal Processing Workshop’98, Vienna, Austria, July 1998.
  • Pedro Aguiar and José M. F. Moura, “Detecting and Solving Template Ambiguities in Motion Segmentation,” ICIP’97, IEEE Proceedings of International Conference on Image Processing, Santa Barbara, CA, October 1997.
  • Jia-Ching Cheng and José M. F. Moura, “Tracking Human Walking in Dynamic Scenes,” ICIP’97, IEEE Proceedings of International Conference on Image Processing, Vol. I, pp. 137-140, Santa Barbara, CA, October 1997.
  • Jia-Ching Cheng and José M. F. Moura, “Model Based Recognition of Human Walking in Dynamic Scenes,” 1st IEEE Workshop on Multimedia Signal Processing, pp. 268-273, Princeton, NJ, June 1997.
  • Amir Asif and José M. F. Moura, “Fast Recursive Reconstruction of Large Time Varying Multidimensional Fields,” ICASSP ’97, IEEE International Conference on Signal Processing, Volume IV, pages 3037-3039, Munich, Germany, April 1997.
  • Fernando Martins, Hirohisa Shiojiri and José M. F. Moura, “3D-3D Registration of Free Formed Objects Using Shape and Texture,” VCIP ’97, Conference on Visual Communications and Image Processing, SPIE Symposium on Electronic Imaging, Science and Technology, San Jose, CA, February 1997.
  • Pedro Aguiar and José M. F. Moura, “Incremental Motion Segmentation in Low Texture,” ICIP’96, IEEE International Conference on Image Processing, Lausanne, Switzerland, September 1996.
  • Radu S. Jasinschi and José M. F. Moura, “Nonlinear Editing by Generative Video,” in ICASSP’96, IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. II, pp. 1220-1223, Atlanta, GA, May 1996. Special Session on Video Processing by Contents.
  • José M. F. Moura, R. S. Jasinschi, H. Shiojiri, and J-C. Lin, “Video Over Heterogeneous Networks,” in Proceedings of SPIE “Wireless Personal Communications Technologies and Services,” vol. SPIE-2603, Philadelphia, PA, October 1995. Invited paper in Session “PCS Data and Multimedia Applications.”
  • F. Martins and José M. F. Moura, “3-D Video Compositing: Towards a Compact Representation for Video Sequences,” in ICIP’95, Proceedings of International Conference on Image Processing, vol. 1, pp. 550-553. IEEE, October 1995.
  • R. S. Jasinschi and José M. F. Moura, “Content-Based Video Sequence Representation,” in ICIP’95, Proceedings of International Conference on Image Processing, vol. 2, pp. 229-232. IEEE, October 1995.
  • R. S. Jasinschi, J. M. F. Moura, J-C. Cheng, and A. Asif, “Video Compression Via Constructs,” in ICASSP ’95, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. IV, pp. IV-2165-2168, Detroit, MI, May 8-12, 1995.
  • A. Asif and José M. F. Moura, “Assimilation of Satellite Data in Beta-Plane Ocean Circulation Models,” ICASSP ’95, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. V, pp. V-2789-2792, Detroit, MI, May 1995. Special Session on Signal Processing in the Ocean Environment.
  • Nikhil Balram and José M. F. Moura, “Predictive Coding Using Noncausal Models,” ICASSP ’93, IEEE International Conference on Acoustics, Speech, and Signal Processing, April 1993.
  • José M. F. Moura and Nikhil Balram, “2D Linear Optimal Statistical Signal Processing on Finite Lattices,” in Acoustic Signal Processing for Ocean Exploration, edts. José M. F. Moura and Isabel M. G. Lourtie, pp. 413-432, D. Reidel, Amsterdam, The Netherlands, January 1993.
  • José M. F. Moura and Nikhil Balram, “Optimal Predictive Coding of 2D Fields,” in ISIT’93, IEEE International Symposium on Information Theory, San Antonio, TX, January 1993.
  • H. Tokuda, Y. Tobe, S. T.-C. Chou, and J. M. F. Moura, “Continuous Media Communication with Dynamic QOS Control Using ARTS with an FDDI Network,” in Proceedings of the ACM Symposium on Communications Architectures and Protocols (ACM SIGCOMM ’92), pages 88-98, Baltimore, October 1992.
  • Nikhil Balram and José M. F. Moura, “Parameter Estimation in 2D Fields,” ICASSP ‘92, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 345-348, March 1992.
  • Nikhil Balram and José M. F. Moura, “Rapid Enhancement and Compression of Image Data,” SPIE VCIP ’91, Boston, MA, November 1991.
  • Nikhil Balram and José M. F. Moura, “Recursive Enhancement of Noncausal Images,” ICASSP ’91, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 2997-3000, May 1991.
  • Nikhil Balram and José M. F. Moura, “Parameter Estimation for Noncausal Gauss Markov Random Fields,” CISS’91, 25th Annual Conference on Information Sciences and Systems, pp. 365-370, Baltimore, MD, March 1991.

Patents and disclosures

  • “Generative Video: Very Low Bit Rate Video Compression,” José M. F. Moura and Radu S. Jasinschi, US Patent and Trademark Office, S.N. 5,854,856, issued December 29, 1998.
  • “Noncausal Predictive Image Codec,” Nikhil Balram and José M. F. Moura, US Patent and Trademark Office, S.N. 5,689,591, issued November 18, 1997.
  • “System and Method for Generating a Three-dimensional Model from a Two-Dimensional Image Sequence,” Pedro M. Q. Aguiar and José M. F. Moura, provisional patent filed July 1999; patent filed with US Patent and Trademark Office, Serial Number 09/614,841, July 12, 2000. Notice of allowance 2/23/2004.

Shape in Image Processing

Shapes provide a rich set of clues on the identity and topological properties of an object. In many imaging environments, however, the same object appears to have different shapes due to such distortions as translation, rotation, reflection, anisotropic scaling, skewing, or shearing. These distortions are generally captured by affine transformations. Further, the order in which the object’s feature points are scanned changes. We refer to these as permutation distortions. The two images below on the left show the same airplane under different affine distortions. When scanned, say lexicographically (top to bottom, left to right), the pixels are not in correspondence.

[Figure: two affine-distorted images of the same airplane (left) and their intrinsic shape (right)]

Relating shapes like these, i.e., shapes of the same object distorted by different affine and permutation transformations, is a challenge. Overcoming the permutation distortions, i.e., the unknown scanning order, is a combinatorial task known as the correspondence problem. Our work is concerned with developing algorithms that are invariant to these affine-permutation distortions. We introduced the concept of the intrinsic shape of an object: a uniquely defined representative of the equivalence class of all affine-permutation distortions of the same object. The intrinsic shape is essentially the shape that results after we factor out the distortions. The figure of the airplane above on the right is the intrinsic shape of the two distorted images on the left, as obtained by BLAISER, a blind algorithm described in the references below by Ha and Moura. The distortions are interpreted as actions of the group of distortions (the affine-permutation group as a subgroup of the general linear group) on the space of configurations (distorted shapes). We developed a blind algorithm that recovers the intrinsic shape from any image of the object distorted by an arbitrary, unknown affine-permutation transformation. We are pursuing the definition of shape space and studying its geometry, for example, the notions of distance and geodesics in shape space.
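The idea of factoring out affine and permutation distortions can be illustrated with a much simpler invariant than the intrinsic shape. The sketch below is not BLAISER and does not reconstruct a shape. It only computes a signature (the sorted squared Mahalanobis distances of the feature points from their centroid) that is unchanged when the points are mapped by an invertible affine transformation and rescanned in a different order. The function name and the toy data are hypothetical.

```python
import random

def affine_permutation_signature(points):
    """Sorted squared Mahalanobis distances of 2D points from their centroid.

    Centering removes translation, normalizing by the covariance removes the
    linear part of an affine map, and sorting removes the scanning order.
    Assumes the points are not all collinear (nonzero covariance determinant).
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    det = sxx * syy - sxy * sxy
    # p^T C^{-1} p with the 2x2 covariance inverse written out explicitly
    d2 = [(syy * x * x - 2.0 * sxy * x * y + sxx * y * y) / det for x, y in centered]
    return sorted(d2)

# Hypothetical toy shape: five non-collinear feature points.
shape = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (1.0, 3.0), (-1.0, 2.0)]

# Distort with an arbitrary invertible affine map (this one includes a reflection)
a11, a12, a21, a22, tx, ty = 2.0, 1.0, 0.5, -1.5, 7.0, -3.0
distorted = [(a11 * x + a12 * y + tx, a21 * x + a22 * y + ty) for x, y in shape]
# and with an unknown scanning order.
random.Random(0).shuffle(distorted)

sig_original = affine_permutation_signature(shape)
sig_distorted = affine_permutation_signature(distorted)
same = all(abs(a - b) < 1e-9 for a, b in zip(sig_original, sig_distorted))
print(same)
```

Two point sets with different signatures cannot be affine-permutation distortions of each other; equal signatures are only a necessary condition, which is one reason a full representative such as the intrinsic shape is needed.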

Selected Journal Papers (additional papers in Journal Publications)

  • Victor Ha and José M. F. Moura, “Affine-permutation Invariance of 2D Shapes,” IEEE Transactions on Image Processing, 14(11), pp. 1687-1700, 2005.

Selected Conference Papers (additional papers in Conference Publications)

  • Victor Ha and José M. F. Moura, “Robust Reorientation of 2D Shapes Using the Orientation Indicator Index,” ICASSP’05, IEEE International Conference on Signal Processing, Philadelphia, PA, March 18-23, 2005.
  • Victor Ha and José M. F. Moura, “Three-dimensional Intrinsic Shapes,” ICIP’04, IEEE International Symposium on Image Processing, Singapore, October 24-27, 2004.
  • David Sepiashvili, José M. F. Moura, and Victor Ha, “Affine Permutation Symmetry: Invariance and Shape Space,” IEEE Workshop on Statistical Signal Processing, St. Louis, MO, September 2003.
  • Victor Ha and José M. F. Moura, “Efficient 2D Shape Orientation,” ICIP’03, IEEE International Conference on Image Processing, Barcelona, Spain, September 2003.
  • Victor Ha and José M. F. Moura, “Intrinsic Shape,” 36th Asilomar Conference on Signals, Systems, and Computers, vol. 2, pp. 993-997, Monterey, CA, November 2002. Invited paper, Special Session on Statistical Image Processing.
  • Victor H. Ha and José M. F. Moura, “Affine Invariant Wavelet Transform,” ICASSP’01, IEEE International Conference on Signal Processing, vol. 3, pp. 1937-1940, Salt Lake City, Utah, May 2001.