The thread table should have a constraint to disallow posting the same thread:
CONSTRAINT unique_board_board_thread_id_constraint UNIQUE (board_id, board_thread_id)
so the pair of values (board_id, board_thread_id) will never be the same for two rows in this table.
We can index some of the columns in the threads table:
CREATE TABLE IF NOT EXISTS threads
( thread_id bigserial primary key
, board_thread_id bigint NOT NULL -- this is the id of the thread in lainchan, mysql
, creation_time timestamp with time zone NOT NULL
, board_id int NOT NULL
, CONSTRAINT board_fk FOREIGN KEY (board_id) REFERENCES boards (board_id) ON DELETE CASCADE
, CONSTRAINT unique_board_board_thread_id_constraint UNIQUE (board_id, board_thread_id)
);
CREATE INDEX threads_creation_time_idx ON threads (creation_time);
CREATE INDEX threads_board_id_idx ON threads (board_id);
CREATE INDEX threads_board_thread_id_idx ON threads (board_thread_id);
A similar thing needs to be done with posts, if we scrape the website we shouldn't
accidentally archive the same post that has the same id in lainchan:
CREATE TABLE IF NOT EXISTS posts
( post_id bigserial primary key
, board_post_id bigint NOT NULL
, creation_time timestamp with time zone NOT NULL
, body text
, body_search_index tsvector
, thread_id bigint NOT NULL
, CONSTRAINT unique_thread_board_id_constraint UNIQUE (thread_id, board_post_id)
, CONSTRAINT thread_fk FOREIGN KEY (thread_id) REFERENCES threads (thread_id) ON DELETE CASCADE
);
CREATE INDEX posts_creation_time_idx ON posts (creation_time);
CREATE INDEX posts_body_search_idx ON posts USING GIN (body_search_index);
CREATE INDEX posts_thread_id_idx ON posts (thread_id);
CREATE INDEX posts_board_post_id_idx ON posts (board_post_id);
Post too long. Click here to view the full text.