Refactor XAccountController.archiveBuild to be more efficient #385

@micahflee

While #384 should fix the case where a specific JSON.stringify call made a user's computer run out of memory because they had so much data, the underlying function is still pretty inefficient. It starts by selecting everything from the SQLite database and storing it in JavaScript constants in memory:

// Tweets
const tweets: XTweetRow[] = exec(
    this.db,
    "SELECT * FROM tweet WHERE text NOT LIKE ? AND username = ? ORDER BY createdAt DESC",
    ["RT @%", this.account.username],
    "all"
) as XTweetRow[];

// Retweets
const retweets: XTweetRow[] = exec(
    this.db,
    "SELECT * FROM tweet WHERE text LIKE ? ORDER BY createdAt DESC",
    ["RT @%"],
    "all"
) as XTweetRow[];

// Likes
const likes: XTweetRow[] = exec(
    this.db,
    "SELECT * FROM tweet WHERE isLiked = ? ORDER BY createdAt DESC",
    [1],
    "all"
) as XTweetRow[];

// Bookmarks
const bookmarks: XTweetRow[] = exec(
    this.db,
    "SELECT * FROM tweet WHERE isBookmarked = ? ORDER BY createdAt DESC",
    [1],
    "all"
) as XTweetRow[];

// Users
const users: XUserRow[] = exec(
    this.db,
    'SELECT * FROM user',
    [],
    "all"
) as XUserRow[];

// Conversations and messages
const conversations: XConversationRow[] = exec(
    this.db,
    'SELECT * FROM conversation ORDER BY sortTimestamp DESC',
    [],
    "all"
) as XConversationRow[];
const conversationParticipants: XConversationParticipantRow[] = exec(
    this.db,
    'SELECT * FROM conversation_participant',
    [],
    "all"
) as XConversationParticipantRow[];
const messages: XMessageRow[] = exec(
    this.db,
    'SELECT * FROM message ORDER BY createdAt',
    [],
    "all"
) as XMessageRow[];

If the selected data is larger than your computer's free memory, this part will run out of memory too.

Instead, this should be refactored to stream each type of data into the JSON file one at a time, clearing out the memory once it's no longer used.
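As a rough sketch of what streaming could look like: the helpers below write one section at a time and one row at a time, so only a single row is serialized in memory at any point. The names `writeJsonArray` and `writeArchive`, the plain-JSON output layout, and the stand-in data are all assumptions for illustration, not the existing archiveBuild code. Synchronous writes keep the example short; a real implementation might prefer a write stream that respects backpressure.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: writes `"key":[row,row,...]` to an open file descriptor,
// serializing one row at a time instead of stringifying the whole array.
function writeJsonArray<T>(fd: number, key: string, rows: Iterable<T>): number {
    fs.writeSync(fd, `${JSON.stringify(key)}:[`);
    let count = 0;
    for (const row of rows) {
        fs.writeSync(fd, (count > 0 ? "," : "") + JSON.stringify(row));
        count++;
    }
    fs.writeSync(fd, "]");
    return count;
}

// Hypothetical helper: each section is a factory, so its rows are only
// requested when that section is about to be written.
function writeArchive(
    filePath: string,
    sections: Record<string, () => Iterable<unknown>>
): void {
    const fd = fs.openSync(filePath, "w");
    try {
        fs.writeSync(fd, "{");
        let first = true;
        for (const [key, makeRows] of Object.entries(sections)) {
            if (!first) {
                fs.writeSync(fd, ",");
            }
            first = false;
            writeJsonArray(fd, key, makeRows());
        }
        fs.writeSync(fd, "}");
    } finally {
        fs.closeSync(fd);
    }
}

// Usage with stand-in data; in archiveBuild each factory would wrap a lazy SQL query.
const demoPath = path.join(
    fs.mkdtempSync(path.join(os.tmpdir(), "archive-")),
    "archive.json"
);
writeArchive(demoPath, {
    tweets: function* () {
        yield { id: "1", text: "hello" };
        yield { id: "2", text: "world" };
    },
    users: () => [],
});
```

This only helps if the row sources are themselves lazy. Passing in arrays that were already loaded with "all" would still hold everything in memory.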

Also, the exec function (currently defined in src/database/common.ts) runs the SQL query and loads all rows into an array in memory. It would probably make sense to have it return an iterator instead of an array, so we can loop through the results without loading them all at once.
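One possible shape for an iterator-returning variant, sketched under assumptions: `execIterate`, `DatabaseLike` and `StatementLike` are hypothetical names, and the sketch assumes the underlying SQLite driver exposes a lazy row iterator on prepared statements (better-sqlite3's `Statement.iterate()` fits this shape). It is not the current exec implementation.

```typescript
// Minimal shapes this sketch relies on (assumptions, not the project's types).
interface StatementLike {
    iterate(...params: unknown[]): IterableIterator<unknown>;
}

interface DatabaseLike {
    prepare(sql: string): StatementLike;
}

// Hypothetical lazy counterpart to exec(..., "all"): rows are pulled from the
// driver one at a time as the caller iterates, never collected into an array.
function* execIterate<T>(
    db: DatabaseLike,
    sql: string,
    params: unknown[] = []
): Generator<T> {
    const stmt = db.prepare(sql);
    for (const row of stmt.iterate(...params)) {
        yield row as T;
    }
}
```

A caller would then loop with `for (const tweet of execIterate<XTweetRow>(this.db, sql, params)) { ... }` and write each row out as it arrives. Depending on the driver, the connection may not be usable for other statements while an iterator is still open, so that is worth checking before interleaving queries.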

I'd say this is low priority because premature optimization is the root of all evil, but I could see some users with old computers and massive amounts of data hitting this.

Labels: enhancement
