NEWS

  1. Jun 2026

    I will give an invited talk on threshold Dilithium at the PACOH workshop in Korea. Come say hi if you're around!

  2. May 2026

    Our paper "Fast Homomorphic Linear Algebra with BLAS" was published in the Journal of Cryptology.

  3. Apr 2026

    Our paper "THED: Threshold Dilithium from FHE" is available online.

  4. Feb 2026

    I gave an invited talk on threshold Dilithium at SNU.

  5. Jan 2026

    Our paper "Scaling up FHE-based Privacy-Preserving ML: Higher Throughput, Longer Inputs for LLama-3-8B" is available online.

  6. Jan 2026

    I have been promoted to Senior Researcher at CryptoLab.

Selected Publications

  1. Journal of Cryptology 2026

    Fast Homomorphic Linear Algebra with BLAS

    with Youngjin Bae, Jung Hee Cheon, Guillaume Hanrot, and Damien Stehlé

    Homomorphic encryption is a cryptographic paradigm that allows computing on encrypted data, opening a wide range of applications in privacy-preserving data manipulation, notably in AI. Many of those applications require significant linear algebra computations (matrix-vector and matrix-matrix products). This central role of linear algebra computations goes far beyond homomorphic algebra and applies to most areas of scientific computing. This high versatility led, over time, to the development of a set of highly optimized routines, specified in 1979 under the name BLAS (basic linear algebra subroutines). Motivated both by the applicative importance of homomorphic linear algebra and the access to highly efficient implementations of cleartext linear algebra able to draw the most out of available hardware, we explore the connections between CKKS-based homomorphic linear algebra and floating-point plaintext linear algebra. The CKKS homomorphic encryption system is the most natural choice in this setting, as it natively handles real numbers and offers a large SIMD parallelism. We provide reductions for matrix-vector products, vector-vector products for moderate-sized to large matrices to their plaintext equivalents. Combined with BLAS, we demonstrate that the efficiency loss between CKKS-based encrypted square matrix multiplication and double-precision floating-point square matrix multiplication is a mere factor of 4 to 12, depending on the precise situation.

  2. Eurocrypt 2025

    Ciphertext-Ciphertext Matrix Multiplication: Fast for Large Matrices

    sole author

    Matrix multiplication of two encrypted matrices (CC-MM) is a key challenge for privacy-preserving machine learning applications. As modern machine learning models focus on scalability, fast CC-MM on large datasets is increasingly in demand. In this work, we present a CC-MM algorithm for large matrices. The algorithm consists of plaintext matrix multiplications (PP-MM) and ciphertext matrix transpose algorithms (C-MT). We propose a fast C-MT algorithm, which is computationally inexpensive compared to PP-MM. By leveraging high-performance BLAS libraries to optimize PP-MM, we implement large-scale CC-MM with substantial performance improvements. Furthermore, we propose lightweight algorithms, significantly reducing the key size from 1 960 MB to 1.57 MB for CC-MM with comparable efficiency. In a single-thread implementation, the C-MT algorithm takes 0.76 seconds to transpose a 2 048 × 2 048 encrypted matrix. The CC-MM algorithm requires 85.2 seconds to multiply two 4 096 × 4 096 encrypted matrices. For large matrices, our algorithm outperforms the state-of-the-art CC-MM method from Jiang-Kim-Lauter-Song [CCS'18] by a factor of over 800.

  3. Crypto 2024

    Plaintext-Ciphertext Matrix Multiplication and FHE Bootstrapping: Fast and Fused

    with Youngjin Bae, Jung Hee Cheon, Guillaume Hanrot, and Damien Stehlé

    Homomorphically multiplying a plaintext matrix with a ciphertext matrix (PC-MM) is a central task for the private evaluation of transformers, commonly used for large language models. We provide several RLWE-based algorithms for PC-MM that consist of multiplications of plaintext matrices (PP-MM) and comparatively cheap pre-processing and post-processing steps: for small and large dimensions compared to the RLWE ring degree, and with and without precomputation. For the algorithms with precomputation, we show how to perform a PC-MM with a single floating-point PP-MM of the same dimensions. This is particularly meaningful for practical purposes as a floating-point PP-MM can be implemented using high-performance BLAS libraries. The algorithms rely on the multi-secret variant of RLWE, which allows multiple ciphertexts to be represented more compactly. We give algorithms to convert from usual shared-secret RLWE ciphertexts to multi-secret ciphertexts and back. Further, we show that this format is compatible with homomorphic addition, plaintext-ciphertext multiplication, and key-switching. This in turn allows us to accelerate the slots-to-coeffs and coeffs-to-slots steps of CKKS bootstrapping when several ciphertexts are bootstrapped at once. Combining batch-bootstrapping with efficient PC-MM results in MaMBo (Matrix Multiplication Bootstrapping), a bootstrapping algorithm that can perform a PC-MM for a limited overhead.

  4. Crypto 2023

    HERMES: Efficient Ring Packing using MLWE Ciphertexts and Application to Transciphering

    with Youngjin Bae, Jung Hee Cheon, Jaehyung Kim, and Damien Stehlé

    Most of the current fully homomorphic encryption (FHE) schemes are based on either the learning-with-errors (LWE) problem or on its ring variant (RLWE) for storing plaintexts. During the homomorphic computation of FHE schemes, RLWE formats provide high throughput when considering several messages, and LWE formats provide a low latency when there are only a few messages. Efficient conversion can bridge the advantages of each format. However, converting LWE formats into RLWE format, which is called ring packing, has been a challenging problem. We propose an efficient solution for ring packing for FHE. The main improvement of this work is twofold. First, we accelerate the existing ring packing methods by using bootstrapping and ring switching techniques, achieving practical runtimes. Second, we propose a new method for efficient ring packing, HERMES, by using ciphertexts in Module-LWE (MLWE) formats, to also reduce the memory. To this end, we generalize the tools of LWE and RLWE formats for MLWE formats. On a single-thread implementation, HERMES consumes 10.2s for the ring packing of 32768 LWE-format ciphertexts into an RLWE-format ciphertext. This gives 41x higher throughput compared to the state-of-the-art ring packing for FHE, PEGASUS [S&P'21], which takes 51.7s for packing 4096 LWE ciphertexts with similar homomorphic capacity. We also illustrate the efficiency of HERMES by using it for transciphering from LWE symmetric encryption to CKKS fully homomorphic encryption, significantly outperforming the recent proposals HERA [Asiacrypt'21] and Rubato [Eurocrypt'22].

  5. IEEE TIFS 2022

    Efficient Homomorphic Evaluation on Large Intervals

    with Jung Hee Cheon and Wootae Kim

    Homomorphic encryption (HE) is being widely used for privacy-preserving computation. Since HE schemes only support polynomial operations, it is prevalent to use polynomial approximations of non-polynomial functions. We cannot monitor the intermediate values during the homomorphic evaluation; as a consequence, we should utilize polynomial approximations with sufficiently large approximation intervals to prevent the failure of the evaluation. However, a large approximation interval may incur computational overhead, which is a serious bottleneck for HE applications on real-world data. In this work, we introduce domain extension polynomials (DEPs) that extend the domain interval of functions by a factor of k while preserving the feature of the original function on its original domain interval. By repeatedly iterating the domain-extension process with DEPs, we can extend with O(log K) operations the domain of a given function by a factor of K while the feature of the original function is preserved in its original domain interval. By using DEPs, we can efficiently evaluate in an encrypted state a function that converges at infinities. To uniformly approximate the function on [-R,R], our method uses O(log R) operations and O(1) memory. This is more efficient than the previous approach, the minimax approximation and Paterson-Stockmeyer algorithm, which uses Omega(sqrt(R)) multiplications and memory for the evaluation. As another application of DEPs, we also suggest a method to manage the risky outliers from a large interval [-R,R] by using O(log R) additional multiplications. As a real-world application, we trained the logistic regression classifier on large public datasets in an encrypted state by using our method. We apply our method to the evaluation of the logistic function on large intervals.

For the complete list of my publications, please click here.
