Bihan Wen

  • Advisor:
      • Yoram Bresler
  • Departments:
  • Areas of Expertise:
      • Compressed Sensing
      • Big Data
      • Biomedical Imaging
      • Sparse / Low-rank Representation
      • Machine Learning
      • Computer Vision
      • Image/Video Processing
  • Thesis Title:
      • Transform learning based image and video processing
  • Thesis abstract:
      • In recent years, sparse signal modeling, especially using the synthesis dictionary model, has received much attention. Sparse coding in the synthesis model is, however, NP-hard. Various methods have been proposed to learn such synthesis dictionaries from data. Numerous applications such as image denoising, magnetic resonance imaging (MRI), and computed tomography (CT) reconstruction have been shown to benefit from a good adaptive sparse model. Recently, the sparsifying transform model has received interest, for which sparse coding is cheap and exact, and learning, or data-driven adaptation, admits computationally efficient solutions. In this thesis, we present two extensions to the transform learning framework, and some applications. In the first part of this thesis, we propose a union of sparsifying transforms model. Sparse coding in this model reduces to a form of clustering. The proposed model is also equivalent to a structured overcomplete sparsifying transform model with block cosparsity, dubbed OCTOBOS. The alternating algorithm introduced for learning such transforms involves simple closed-form solutions. Theoretical analysis provides a convergence guarantee for this algorithm. It is shown to be globally convergent to the set of partial minimizers of the non-convex learning problem. When applied to images, the algorithm learns a collection of well-conditioned square transforms, and a good clustering of patches or textures. The resulting sparse representations for the images are better than those obtained with a single learned transform, or with analytical transforms. We show the promising performance of the proposed approach in image denoising, which compares quite favorably with approaches involving a single learned square transform, an overcomplete synthesis dictionary, or Gaussian mixture models. The proposed denoising method is also faster than the synthesis dictionary based approach.
        Next, we develop a methodology for online learning of square sparsifying transforms. Such online learning can be particularly useful when dealing with big data, and for signal processing applications such as real-time sparse representation and denoising. The proposed transform learning algorithms are shown to have a significantly lower computational cost than online synthesis dictionary learning. In practice, the sequential learning of a sparsifying transform typically converges faster than batch mode transform learning. Preliminary experiments show the usefulness of the proposed schemes for sparse representation and denoising. In the third part, we present a video denoising framework based on online 3D sparsifying transform learning. The proposed scheme has low computational and memory costs, and can handle streaming video. Our numerical experiments show promising performance for the proposed video denoising method compared to popular prior or state-of-the-art methods.
  • Downloads:
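
Illustrative sketch (not taken from the thesis): the abstract states that sparse coding in the transform model is cheap and exact, and that in a union of transforms it reduces to a form of clustering. The Python snippet below is a minimal illustration of those two statements under simple assumptions: the sparse code is taken to be the s largest-magnitude entries of W x, and each signal is assigned to the transform with the smallest sparsification error. The function names, the toy transforms, and the sparsity level are hypothetical, and any regularization or learning step used in the thesis is omitted.

```python
import numpy as np


def sparse_code(W, x, s):
    """Keep the s largest-magnitude entries of W @ x and zero the rest."""
    z = W @ x
    code = np.zeros_like(z)
    if s > 0:
        keep = np.argsort(np.abs(z))[-s:]  # indices of the s largest magnitudes
        code[keep] = z[keep]
    return code


def assign_cluster(transforms, x, s):
    """Pick the transform with the smallest sparsification error for x.

    Assumption: the error is the squared distance between W @ x and its
    thresholded version; this is a simplified clustering measure.
    """
    errors = [float(np.sum((W @ x - sparse_code(W, x, s)) ** 2))
              for W in transforms]
    return int(np.argmin(errors)), errors


# Toy data (hypothetical): two orthonormal 4x4 transforms and a flat signal.
identity = np.eye(4)
hadamard = 0.5 * np.array([[1, 1, 1, 1],
                           [1, 1, -1, -1],
                           [1, -1, 1, -1],
                           [1, -1, -1, 1]], dtype=float)
x = np.ones(4)

k, errors = assign_cluster([identity, hadamard], x, s=1)
print("chosen transform index:", k)
print("sparsification errors:", errors)
```

In this toy case the flat signal is exactly 1-sparse under the second transform, so it is assigned to that cluster, while the identity transform leaves a nonzero residual.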

Contact information: