LeetCode Solutions Blog

Design a File Sharing System

Number: 1640

Difficulty: Medium

Paid? Yes

Companies: Twitch

Problem Description

A file-sharing system maintains a large file divided into m chunks (identified by IDs 1 through m). When a user joins, they are assigned the smallest available unique user ID (IDs of users that have left can be reused). Each user starts with a collection of owned chunks and, upon requesting a chunk, the system returns the sorted list of all active users that currently have that chunk. If the returned list is non-empty, the requester also obtains a copy of the requested chunk.

Key Insights

Use a min-heap to always assign the smallest available user ID.
Maintain a mapping from chunkID to a set of userIDs that own the chunk.
Maintain a mapping from userID to a set of chunkIDs the user owns; this helps when a user leaves.
On a request, if the chunk is available, add the requester to the set of owners.
Efficient operations are key since join, leave, and request are called frequently.

Space and Time Complexity

Time Complexity:

join: O(k log n) where k is the number of chunks provided and n is the number of users (due to heap operations).
leave: O(k) where k is the number of chunks owned by the user.
request: O(s log s) where s is the number of owners (mainly for sorting the result). Space Complexity:
O(m + total active user-chunk associations) for storing chunk-to-users and user-to-chunks mappings.

Solution

We use a min-heap to track and reuse available user IDs. Two mappings are maintained:

A dictionary (or equivalent) mapping each chunk ID to the set of user IDs that currently own it.
A mapping from user ID to the set of chunks they possess. During join, assign a user ID from the available pool (or a new one if none is available), and update both mappings. When a user requests a chunk, return the sorted list of current owners and add the chunk to the requester’s possession. When a user leaves, remove that user from all chunk sets and push their ID back into the min-heap.

Code Solutions

import heapq

class FileSharing:
    def __init__(self, m: int):
        # Total number of chunks in the file.
        self.m = m
        # Mapping from chunkID to a set of userIDs owning this chunk.
        self.chunk_to_users = {i: set() for i in range(1, m+1)}
        # Mapping from userID to a set of chunks the user owns.
        self.user_to_chunks = {}
        # Min-heap for available user IDs.
        self.available_ids = []
        # The next new user ID if no available ids exist.
        self.next_id = 1

    def join(self, ownedChunks: list) -> int:
        # Assign the smallest available user id.
        if self.available_ids:
            user_id = heapq.heappop(self.available_ids)
        else:
            user_id = self.next_id
            self.next_id += 1
        # Record the chunks that the user owns.
        self.user_to_chunks[user_id] = set(ownedChunks)
        # Update chunk ownership mapping.
        for chunk in ownedChunks:
            self.chunk_to_users[chunk].add(user_id)
        return user_id

    def leave(self, userID: int) -> None:
        # Remove the user from all chunks they own.
        owned_chunks = self.user_to_chunks.pop(userID, set())
        for chunk in owned_chunks:
            self.chunk_to_users[chunk].discard(userID)
        # Return the userID to the available pool.
        heapq.heappush(self.available_ids, userID)

    def request(self, userID: int, chunkID: int) -> list:
        # Retrieve current owners of the requested chunk.
        owners = self.chunk_to_users[chunkID]
        if not owners:
            return []
        # Prepare sorted list of owners.
        result = sorted(owners)
        # If the user does not already have this chunk, add it.
        if chunkID not in self.user_to_chunks[userID]:
            self.user_to_chunks[userID].add(chunkID)
            self.chunk_to_users[chunkID].add(userID)
        return result

# Example usage:
# obj = FileSharing(m)
# userId = obj.join(ownedChunks)
# obj.leave(userID)
# owners = obj.request(userID, chunkID)