COmpression et REprésentation des Signaux Audiovisuels

Tutorials

Multimedia Content Protection: From Data Hiding to Encryption, Including Obfuscation and Secret Sharing

An increasing amount of multimedia data—such as images, videos, and 3D content—is transmitted over digital networks, stored or shared in the cloud, and disseminated through social media platforms. Beyond the need to compress these large multimedia datasets, issues related to confidentiality, privacy, and sensitive information are becoming more critical. It is therefore increasingly necessary to protect the multimedia content itself, not just access to the networks through which it is transmitted.

In this tutorial, after detailing the specific characteristics of different types of multimedia data, in terms of both compression and protection, we will present the various approaches available for protecting such content. These methods will be illustrated through a range of applications, from medical imaging to the Metaverse, including drone video and manufacturing industries such as fashion.

The first part will focus on data hiding techniques, distinguishing between watermarking and steganography, and will conclude with a discussion on steganalysis. The second part will cover aspects of cryptography applied to multimedia content, differentiating between selective encryption and partial encryption, and will conclude with crypto-compression techniques (for images, video, and 3D objects). The third part will introduce various image obfuscation methods, whether reversible or irreversible, visible or invisible. We will see that invisible image obfuscation primarily relies on the generation of fake images. Finally, we will address secret sharing methods applied to images.

To conclude, we will also discuss the “cat-and-mouse” game between attackers and defenders, highlighting adversarial attacks and defenses that must be taken into account.

William Puech received his engineering degree in electrical engineering from the University of Montpellier, France (1991), and a Ph.D. in Signal, Image, and Speech Processing from the Institut National Polytechnique de Grenoble, France (1997), with research focused on image processing and computer vision. He was a visiting associate researcher at the University of Thessaloniki, Greece. From 1997 to 2008, he was an Associate Professor at the University of Montpellier, where he has been a Full Professor in image processing since 2009. His current research interests include image forensics and security for secure transmission, storage, and visualization, combining data hiding, compression, cryptography, and machine learning. He leads the ICAR (Image and Interaction) team within LIRMM and has published more than 50 journal articles and 160 conference papers. He serves as an Associate Editor for four journals (SPIC, SP, JVCIR, and IEEE TDSC) in the fields of image forensics and security, and as a Senior Editor for IEEE TIFS. Since 2017, he has been Chair of the French Chapter of the IEEE Signal Processing Society. He was a member of the IEEE Information Forensics and Security Technical Committee (TC) from 2018 to 2020, and again since 2022. Since 2021, he has also been a member of the IEEE Image, Video, and Multidimensional Signal Processing Technical Committee.

Ultra-Low Bitrate Video Conferencing with Generative Face Video Coding: From Research to Standardization

Video conferencing applications account for a significant portion of Internet video traffic, which has increased markedly in the past few years with the global pandemic. Current video conferencing systems rely on conventional advanced video compression standards such as H.264, HEVC, or VVC. However, despite over three decades of refinement and optimization, these codecs still struggle to deliver satisfactory performance at extremely low bitrates. In scenarios where bandwidth is severely constrained, such as in congested networks or areas with weak radio coverage, the resulting video quality becomes unacceptable (with loss of facial details), significantly degrading the video conferencing experience. Generative Face Video Coding (GFVC) architectures, enabled by recent advances in deep learning, have demonstrated a high potential to address these issues. Such architectures process facial video data efficiently by employing generative modeling to represent and reconstruct facial video content in a compact form. This process drastically reduces bandwidth requirements while enhancing visual quality, ultimately improving the user experience in video conferencing applications. This tutorial will give a complete overview of GFVC schemes, from recent research advances in the literature to current standardization activities.

Giuseppe Valenzise is a CNRS senior researcher (Directeur de Recherche) at Université Paris-Saclay, within the Laboratoire des Signaux et Systèmes (L2S). He is currently the Editor-in-Chief of the Journal on Image and Video Processing (Springer). He received his Ph.D. degree from Politecnico di Milano and joined the French Centre National de la Recherche Scientifique (CNRS) as a permanent researcher in 2012. At L2S, he led the Multimedia and Networking team from 2023 to 2025. His research interests span 2D and 3D image and video processing, including traditional and learning-based image and video compression, point clouds and 3D Gaussian Splatting, image and video quality assessment, high dynamic range imaging, and machine-learning-based image and video analysis. In 2018, he received the EURASIP Early Career Award for his contributions to video coding and analysis. Giuseppe has extensive experience in scientific service and conference organization, regularly serving on the organizing and technical committees of major international conferences in multimedia and signal processing, including ICIP, ICASSP, and ICME. He serves or has served as Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Image Processing (Outstanding Editorial Board Member Award in 2022 and 2023), and Signal Processing: Image Communication. He was Chair of the Multimedia Signal Processing (MMSP) Technical Committee of the IEEE Signal Processing Society for the 2024-2025 term, and served as General Co-Chair of the IEEE International Conference on Multimedia & Expo (ICME) 2025.

Volumetric Video Technologies for Real-Time Immersive Communication

Immersive communication is emerging as a key application of extended reality technologies, aiming to enable natural remote interactions that go beyond traditional video conferencing. Volumetric video enables the capture and transmission of dynamic three-dimensional representations of people and scenes, allowing users to observe and interact with content from arbitrary viewpoints. Delivering such experiences in real time requires advances in both volumetric data acquisition and efficient data compression.

This tutorial will cover both the theoretical foundations and practical implementation aspects of volumetric media systems for immersive communication. It will compare capture technologies ranging from professional studio-based systems to consumer-oriented real-time solutions, and discuss their capabilities and constraints in terms of quality and performance. In addition, it will introduce volumetric compression methods, with emphasis on V-PCC and practical real-time encoder design, giving attendees insight into the trade-offs between theoretical coding efficiency and real-time operation.

Alexandre Mercat is a tenure-track Assistant Professor at Tampere University (TAU), Finland. He received his Ph.D. in electrical and computer engineering from INSA Rennes, France, in 2018, and was a Postdoctoral Researcher at TAU from 2018 to 2024. Within the Ultra Video Group (UVG), his research spans video coding, processing, streaming, energy- and complexity-aware design, and emerging volumetric formats, with a strong emphasis on open-source encoders and datasets. He has co-authored over 50 peer-reviewed publications, developed widely used open-source codecs and datasets, and received multiple Best Paper Awards, including at ACM MMSys and IEEE VCIP. He is a member of the IEEE Visual Signal Processing and Communications Technical Committee and co-founded the Insights from Negative Results track in the Journal of Signal Processing Systems.
