Matching object IDs between environment graph and instance segmentation #153

Description

@Yuchen0112

Hi, I'm trying to overlay object IDs on a scene image using the instance segmentation (seg_inst) and the environment graph returned by comm.environment_graph(). The scene image and the segmentation mask are shown below:

Image

Image

For example, the sofa node from the environment graph looks like this:
{ "id": 368, "category": "Furniture", "class_name": "sofa", "properties": ["SURFACES", "SITTABLE", "LIEABLE", "MOVABLE"], "states": []}
which has an ID of 368.

Here is the code I wrote for overlaying object IDs with the scene image rgb_file, instance segmentation seg_file, and the environment graph sg:

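import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

# rgb_file, seg_file: image paths; sg: environment graph dict
rgb_image = cv2.imread(rgb_file)
rgb_image = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2RGB)
# Grayscale loading reduces the mask to a single 8-bit channel (0-255)
seg_mask = cv2.imread(seg_file, cv2.IMREAD_GRAYSCALE)

image_pil = Image.fromarray(rgb_image)
draw = ImageDraw.Draw(image_pil)
font = ImageFont.load_default()  # any PIL font works here
unique_labels = np.unique(seg_mask)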

for label in unique_labels:
    if label == 0:  # Skip background
        continue
    obj_info = next((obj for obj in sg["nodes"]
                     if obj["id"] == int(label)), None)
    if obj_info is None:
        continue
    y_coords, x_coords = np.where(seg_mask == label)
    if len(x_coords) == 0 or len(y_coords) == 0:
        continue

    center_x = int(np.mean(x_coords))
    center_y = int(np.mean(y_coords))
    text = f"{label}"
    bbox = draw.textbbox((center_x, center_y), text, font=font)
    box_width = bbox[2] - bbox[0]
    box_height = bbox[3] - bbox[1]
    draw.rectangle([center_x - box_width//2 - 2,
                    center_y - box_height//2 - 2,
                    center_x + box_width//2 + 2,
                    center_y + box_height//2 + 2],
                   fill='white', outline='black')
    draw.text((center_x, center_y), text,
              fill='black', font=font, anchor="mm")

image_pil.save("label.png")

The output is shown in the following image. However, the labels obtained from unique_labels = np.unique(seg_mask) do not match the object IDs in the environment graph. For example, the sofa is labeled 40 instead of 368 (its id in the environment graph); see the lower center of the image.

Image
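To illustrate why a grayscale read may not preserve per-object labels, here is a minimal sketch on a synthetic mask. The colors are made up, and the luma weights (0.299, 0.587, 0.114) are only an approximation of a typical RGB-to-gray conversion; the exact rounding used by an image loader may differ.

```python
import numpy as np

def unique_colors(mask_rgb: np.ndarray) -> np.ndarray:
    """Return the distinct (R, G, B) triples of an H x W x 3 mask."""
    return np.unique(mask_rgb.reshape(-1, 3), axis=0)

# Synthetic mask with two objects whose colors differ only slightly
# (values are illustrative, not taken from the simulator).
mask_rgb = np.zeros((2, 4, 3), dtype=np.uint8)
mask_rgb[:, :2] = (40, 0, 1)
mask_rgb[:, 2:] = (40, 0, 2)

# Approximate grayscale conversion: one weighted sum per pixel.
weights = np.array([0.299, 0.587, 0.114])
gray = np.rint(mask_rgb @ weights).astype(np.uint8)

colors = unique_colors(mask_rgb)   # two distinct colors survive
gray_labels = np.unique(gray)      # both objects collapse to one value
```

In this toy case the color mask still distinguishes the two objects, while the grayscale version merges them, so a value such as 40 from a grayscale read is not necessarily an object identifier.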

I assume the values in the segmentation mask are assigned arbitrarily: each channel of an 8-bit image only holds values from 0 to 255, while object IDs in the environment graph can be greater than 255. How can I get the corresponding object IDs from the segmentation mask? Thanks very much!
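One possible direction, sketched under assumptions: if a mapping from mask color to graph ID is available, the mask can be read in color and each color looked up directly. The `color_to_id` dictionary and the `centroids_by_id` helper below are hypothetical; how to obtain such a mapping from the simulator is not shown in this post. The mask must use the same channel order as the mapping keys (note that cv2.imread returns BGR).

```python
import numpy as np

def centroids_by_id(mask_rgb, color_to_id):
    """Return {object_id: (center_x, center_y)} for colors found in the mask.

    mask_rgb:    H x W x 3 uint8 array (color instance mask)
    color_to_id: hypothetical dict mapping (R, G, B) tuples to graph IDs
    """
    centers = {}
    for color, obj_id in color_to_id.items():
        # True where all three channels match this color.
        hit = np.all(mask_rgb == np.array(color, dtype=mask_rgb.dtype), axis=-1)
        ys, xs = np.nonzero(hit)
        if xs.size == 0:
            continue  # this object is not visible in the frame
        centers[obj_id] = (int(xs.mean()), int(ys.mean()))
    return centers

# Synthetic example: one 2 x 3 pixel region with a made-up color for the sofa.
mask_rgb = np.zeros((4, 6, 3), dtype=np.uint8)
mask_rgb[1:3, 2:5] = (12, 200, 7)
color_to_id = {(12, 200, 7): 368, (1, 2, 3): 99}

centers = centroids_by_id(mask_rgb, color_to_id)
```

The returned centers could then replace the `np.unique(seg_mask)` loop above, with the graph ID drawn as the label text instead of the pixel value.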

P.S. The versions of VirtualHome and Unity simulator I use are both v2.3.0. The code is implemented in Python 3.10.15 and executed on Ubuntu 20.
