IQ Mini 4489 Probabilities Report

Introduction

IQ Mini 4489

Game

x	x	o	y	y
x	x	x	x	y
o	x	z	z	y
z	x	z	o	y
z	z	z	y	y

It’s a 5x5 grid of holes, connected as shown in the diagram. Each colored path contains one peg which can be moved around to any other square with the same color. The point of the game is to fit the 6 uniquely shaped pieces in the puzzle such that all of them fit (and subsequently fill up the entire matrix).

Pieces

Very similar to tetris:

Tetris

The pieces are as follows:

3-in-a-row

Effectively 2 different ways to place it. It’s just 3 consecutive squares.

L shape

There are 4 different ways to place it. It’s a 3-in-a-row with an extra square attached to one of the ends. You can also use the third dimension and flip it, leading to 8 different ways to place it.

T shape

There are 4 different ways to place it. It’s a 3-in-a-row with an extra square attached to the middle.

Square

1 way to place it. It’s a 2x2 square.

Smaller L shape

There are 4 different ways to place it. It’s a 2-in-a-row with an extra square attached to one of the ends.

2 lines offset by 1

4 different ways.

Optimization

Bitmasking
Pre-computing all possible positions of the pieces
- Themselves
- On the (5x5) board
Skip recomputation of forbidden pins by flipping the matrix around (sort of)
Store partial solutions (very hard)

Probabilities

Data

Loading Data and libs

library(jsonlite)
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)

data <- fromJSON("../src/data_extended.json")

Data Exploration

head(data)

##   pinA pinB pinC solution_count
## 1 0, 0 0, 4 4, 0            126
## 2 0, 0 0, 4 4, 4            126
## 3 0, 0 4, 0 0, 4            126
## 4 0, 0 4, 0 4, 4            126
## 5 0, 0 4, 4 0, 4            126
## 6 0, 0 4, 4 4, 0            126

Each record contains three forbidden pins (pinA, pinB, pinC) and a solution_count. Important: - The first value in each tuple is the y-coordinate (row), and the second value is the x-coordinate (column). - solution_count indicates how many unique puzzle solutions exist when pins are placed in the specified positions.

Data Preparation

Flattening Pin Positions

Since each pin is represented by a two-element vector ([y, x]), we separate these into individual columns:

data <- data %>%
  rowwise() %>%
  mutate(
    pinA_y = pinA[1],
    pinA_x = pinA[2],
    pinB_y = pinB[1],
    pinB_x = pinB[2],
    pinC_y = pinC[1],
    pinC_x = pinC[2]
  ) %>%
  ungroup() %>%
  select(-pinA, -pinB, -pinC)

We now have six columns representing the coordinates of each forbidden pin: - pinA_y, pinA_x - pinB_y, pinB_x - pinC_y, pinC_x

Game Context

The IQ Mini 4489 puzzle is played on a \(5\times 5\) grid, but the “tracks” for pin placement are defined in the table provided above where:

Orange signifies pinA’s range
Green signifies pinB’s range
Purple signifies pinC’s range

We will compare two scenarios:

Scenario A: Pins (pinA, pinB, pinC) can be placed anywhere on the \(5\times 5\) grid.
Scenario B: Pins are restricted to their specific tracks as defined in the table.

Exploratory Data Analysis

Distribution of Solution Counts

To visualize how often each solution count occurs, we use a histogram:

ggplot(data, aes(x = solution_count)) +
  geom_histogram(binwidth = 5, fill = "blue", color = "black") +
  theme_minimal() +
  labs(
    title = "Distribution of Solution Counts",
    x = "Solution Count",
    y = "Frequency"
  )

Forbidden Pin Influence on Solution Counts

We examine how the y-coordinates of each pin influence the solution_count. Below is an example focusing on pinA_y:

Seems like they are all pretty much the same.

Scenario Analysis

Scenario A: Pins Can Be Placed Anywhere

Under this assumption, all grid squares are valid for pin placement.

Observations:
- Certain positions may drastically lower the number of solutions, especially if they block puzzle pieces.
- For example, placing a forbidden pin in the center might have a different impact compared to placing it in a corner.
Pairwise Heatmap (Example with the mean solution_count of pinA_x vs. pinB_x)

We can see that the solution count is generally higher when the pins are further apart and that my observations were correct.

Scenario B: Pins Must Follow Their Tracks

Here, each pin must stay on its designated track. Note that we’re using aggregated data for this analysis. (i.e., we’re taking the mean of solution_count for each unique pin configuration).

Defining Valid Positions

First, we define the allowed positions for each pin based on the game context:

allowed_pinA <- tibble(
  pinA_y = c(0, 0, 1, 1, 1, 1, 2, 3), # Orange track
  pinA_x = c(0, 1, 0, 1, 2, 3, 1, 1)
)

allowed_pinB <- tibble(
  pinB_y = c(0, 0, 1, 2, 3, 4, 4), # Green track
  pinB_x = c(3, 4, 4, 4, 4, 3, 4)
)

allowed_pinC <- tibble(
  pinC_y = c(2, 2, 3, 3, 4, 4, 4), # Purple track
  pinC_x = c(2, 3, 0, 2, 0, 1, 2)
)

Filtering Data for Scenario B

Next, we filter the dataset to include only the configurations where each pin is placed on its respective track:

valid_scenarioB <- data %>%
  semi_join(allowed_pinA, by = c("pinA_y", "pinA_x")) %>%
  semi_join(allowed_pinB, by = c("pinB_y", "pinB_x")) %>%
  semi_join(allowed_pinC, by = c("pinC_y", "pinC_x"))

Analyzing Scenario B

Once filtered, we can perform similar analyses as in Scenario A:

Distribution of Solution Counts:
Pairwise Heatmap (Example with pinA_x vs. pinB_x):

_{Wow, that’s an interesting one!}

The observation here is that it is not similar to the previous one whatsoever. Once pinA_x is placed at 3, the solution count is generally lower, regardless of the pinB_x position. A hypothesis as to why this is the case is that the pinA track is close to the edge of the board, which might limit the movement of puzzle pieces.

That hypothesis crumbles, however, as once pinB_x is placed at 3.5, specifically when pinA_x \(< 3\), the solution count is generally higher. This suggests that the pinB track might have a more significant impact on the puzzle’s solvability.

Considering this interesting development, let’s look at the other pairwise heatmaps to see if there are any similar patterns.

Observing the above heatmaps, more interesting patterns emerge. Seems like it’s definitive that if pinA_x is placed at 3, the solution count is generally lower, regardless of the other pins’ positions. This is a stark contrast to the scenario where pins can be placed anywhere, where the solution count was generally higher when the pins were further apart.

The highest solution counts are observed when pinA_x is placed at 0 or 1, and pinB_x is placed at 3 or 4. This suggests that the puzzle is more solvable when the pinA and pinB tracks are further apart on the x axis.

It’s also curious that when pinA_x is placed at 3 and pinC_x \(\in [1,2]\) we get the lowest solution counts of \(10\).

Let’s wrap this up with a final comparison of all three pins’ positions:

3D Visualization

We can use a 3D scatter plot to visualize how pin positions affect solution_count for both scenarios. These are a bit confusing, but they’re fun to look at.

Scenario A (Pins Anywhere):
Scenario B (Pins on Tracks):

Probability Analysis

We categorize solution_count into three levels:

High (upper 25%)
Medium (50%, average)
Low (lower 25%)

based on the 25th and 75th percentiles, and then compute probabilities for both Scenario A and Scenario B.

Defining Categories

high_threshold_A <- quantile(data$solution_count, 0.75)
low_threshold_A <- quantile(data$solution_count, 0.25)

data <- data %>%
  mutate(solution_category_A = case_when(
    solution_count >= high_threshold_A ~ "High",
    solution_count <= low_threshold_A ~ "Low",
    TRUE ~ "Medium"
  ))

high_threshold_B <- quantile(valid_scenarioB$solution_count, 0.75)
low_threshold_B <- quantile(valid_scenarioB$solution_count, 0.25)

valid_scenarioB <- valid_scenarioB %>%
  mutate(solution_category_B = case_when(
    solution_count >= high_threshold_B ~ "High",
    solution_count <= low_threshold_B ~ "Low",
    TRUE ~ "Medium"
  ))

Calculating Probabilities

Scenario A Probability Analysis

probabilities_A <- data %>%
  group_by(solution_category_A) %>%
  summarise(count = n(), .groups = 'drop') %>%
  mutate(probability = count / sum(count))
probabilities_A

## # A tibble: 3 × 3
##   solution_category_A count probability
##   <chr>               <int>       <dbl>
## 1 High                 3156       0.251
## 2 Low                  3456       0.275
## 3 Medium               5976       0.475

Scenario B Probability Analysis

probabilities_B <- valid_scenarioB %>%
  group_by(solution_category_B) %>%
  summarise(count = n(), .groups = 'drop') %>%
  mutate(probability = count / sum(count))
probabilities_B

## # A tibble: 3 × 3
##   solution_category_B count probability
##   <chr>               <int>       <dbl>
## 1 High                   99       0.253
## 2 Low                   126       0.321
## 3 Medium                167       0.426

Visualizing Probabilities

Note that HIGH would mean “easy” in this case.

Conclusion

Scenario B’s constraints reduce the solvability of the puzzle compared to Scenario A, as evident in the increased proportion of Low solution counts. Clearly, the placement of pins has a significant impact on the puzzle’s solvability, more so positively in an unconstrained environment.