Existing Block Serialization Proposals?

BeksOmega · May 21, 2021, 11:31pm

Hello,

I know that sometimes the App Inventor team writes up really nice design proposals like this one which I based my previous GSoC project off of. Currently I'm doing some research into how forks/users of Blockly handle serialization and versioning, and I thought there might be some related proposals from you peoples that I should check out =)

I'm already familiar with the versioning.js file, and I absolutely love the system. But I was wondering if anything has been written/tried wrt serializing the workspace to different formats (e.g. JSON, CBOR, etc.), or compressing the save in other ways.

Thank you for your time!
--Beka

ewpatton · May 21, 2021, 11:40pm

Hi @BeksOmega

I'm not sure we have any other formal proposals at this time, but previously I have written a system to serialize to protobuf, and more recently I have been thinking about writing a serialization mechanism using MessagePack. I mentioned doing the latter at a high level during the Blockly summit but haven't done anything concrete on it yet.

Cheers,
Evan

ewpatton · May 21, 2021, 11:42pm

I should add that for really large block workspaces the protobuf serialization often resulted in 2 orders of magnitude smaller storage, but for some reason really large workspaces caused the protobuf deserializer to throw an unexpected error, so we never tried to productionalize it.

BeksOmega · May 22, 2021, 12:43am

Thanks for the info @ewpatton! I've looked into protobuf and I think it would be really cool to support that. Two orders of magnitude smaller sounds amazing haha. I'll have to look into MessagePack too =) I'd seen it in my research, but I didn't think it was very popular. I'll definitely check it out!

Thank you again for the info, that's really helpful.
--Beka

ewpatton · May 24, 2021, 5:48pm

Here's a copy of the workspace.proto file I wrote for my tests:

syntax = "proto3";

message Field {
  string name = 1;
  string value = 2;
  string id = 3;
  string variableType = 4;
}

message Variable {
  string id = 1;
  string type = 2;
}

message ShadowableBlock {
  Block shadow = 1;
  Block block = 2;
}

message Input {
  string name = 1;
  enum InputType {
    UNKNOWN = 0;
    VALUE = 1;
    STATEMENT = 2;
    STACK = 3;
  }
  InputType type = 2;
  ShadowableBlock connection = 3;
  //repeated Block stack = 4;
}

message Comment {
  string id = 1;
  int64 x = 2;
  int64 y = 3;
  int64 h = 4;
  int64 w = 5;
  bool pinned = 6;
}

message Block {
  string id = 1;
  string type = 2;
  map<string, string> mutations = 3;
  repeated Field fields = 4;
  string comment = 5;
  string data = 6;
  repeated Input inputs = 7;
  ShadowableBlock next = 8;
  bool inline = 9;
  bool collapsed = 10;
  bool disabled = 11;
  bool deletable = 12;
  bool movable = 13;
  bool editable = 14;
  int64 x = 15;
  int64 y = 16;
}

message Workspace {
  repeated Variable variables = 1;
  repeated Comment comments = 2;
  repeated Block blocks = 3;
  int32 young_android_language = 4;
  int32 blocks_language = 5;
}

Note that it makes certain assumptions about the blocks language that are true for App Inventor but might not be true for an arbitrary Blockly workspace.
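One reason a schema like this tends to be compact is that proto3 leaves scalar fields out of the encoded message when they hold their default value (empty string, 0, false). The following is a minimal Python sketch of that idea using plain dataclasses and JSON rather than the real protobuf library. The class names mirror a subset of the messages above, while `to_compact` and the block types `math_add` and `math_number` are illustrative assumptions, not part of the original tests.

```python
import json
from dataclasses import dataclass, field, fields, is_dataclass
from typing import List, Optional


@dataclass
class Block:
    # Subset of the Block message above; defaults follow proto3 conventions.
    type: str = ""
    id: str = ""
    collapsed: bool = False
    disabled: bool = False
    x: int = 0
    y: int = 0
    inputs: List["Input"] = field(default_factory=list)


@dataclass
class ShadowableBlock:
    shadow: Optional[Block] = None
    block: Optional[Block] = None


@dataclass
class Input:
    name: str = ""
    connection: Optional[ShadowableBlock] = None


def to_compact(obj):
    """Convert to plain dicts and lists, dropping members left at their default."""
    if is_dataclass(obj):
        out = {}
        for f in fields(obj):
            value = to_compact(getattr(obj, f.name))
            # Skip anything that equals an "empty" default value.
            if value not in (None, "", 0, False, [], {}):
                out[f.name] = value
        return out
    if isinstance(obj, list):
        return [to_compact(item) for item in obj]
    return obj


# Hypothetical example: an addition block with one number plugged into input "A".
example = Block(
    type="math_add",
    id="b1",
    x=10,
    inputs=[
        Input(
            name="A",
            connection=ShadowableBlock(block=Block(type="math_number", id="b2")),
        )
    ],
)
encoded = json.dumps(to_compact(example), separators=(",", ":"))
```

Here `encoded` contains no `collapsed`, `disabled`, `y`, or `shadow` keys, because none of them differ from their defaults. A real protobuf encoding would go further by replacing the field names with numeric tags.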

BeksOmega · May 24, 2021, 9:49pm

Thank you so much for digging that up Evan! Seeing an example really makes the idea more concrete. I like the ShadowableBlock idea, that's a compact way to handle a connection that can hold 0-1 shadow blocks and 0-1 normal blocks.

My initial thought was to use repeated instead:

message Input {
  // etc...
  repeated Block connection = 3;
}

message Block {
  string id = 1;
  // etc...
  bool shadow = 8;
  bool inline = 9;
  bool collapsed = 10;
  // etc...
}

But I like yours better.
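To make the difference between the two layouts concrete, here is a small Python sketch that converts the "repeated Block plus shadow flag" form into the two named slots of ShadowableBlock. It uses plain dicts in place of generated protobuf classes, and the helper name `from_repeated` is a hypothetical one chosen for this example. The point it tries to show is that the repeated form can express states the named-slot form cannot, so it needs an explicit check.

```python
from typing import Dict, List, Optional


def from_repeated(connection: List[dict]) -> Dict[str, Optional[dict]]:
    """Map a list of child blocks onto the 'shadow' and 'block' slots.

    Raises ValueError when the list holds two shadows or two real blocks,
    which the named-slot layout has no way to represent.
    """
    slots: Dict[str, Optional[dict]] = {"shadow": None, "block": None}
    for child in connection:
        key = "shadow" if child.get("shadow", False) else "block"
        if slots[key] is not None:
            raise ValueError(f"more than one {key} on a single connection")
        slots[key] = child
    return slots


# A valid connection: one shadow and one real block.
valid = from_repeated([{"id": "s1", "shadow": True}, {"id": "b1"}])

# An invalid connection: two real blocks on the same input.
try:
    from_repeated([{"id": "b1"}, {"id": "b2"}])
    rejected = False
except ValueError:
    rejected = True
```

With named slots, the "at most one shadow and at most one block" rule is enforced by the schema itself, whereas the repeated form leaves that rule to every reader of the data.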

Thank you again!
--Beka

ewpatton · May 25, 2021, 9:26pm

I don't quite recall the reason for designing it that way, but if I had to guess, it was probably because if we switch to using shadow blocks for things, we likely don't ever want to store the shadow block, for space reasons. If someone manipulates a shadow block, we would want to promote it to a real block and have shadows as constructs of the prototype to save on storage space. Remember, we pay for every byte sent by the App Inventor server to the client, and at the number of users we have, even adding the tag <shadow></shadow> to a single workspace will add somewhere around 12 cents per month to our bill (then think about that multiplied by a shadow block per input).
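The promote-on-edit idea can be sketched in a few lines of Python. This is only one reading of the approach described above, not App Inventor's implementation: the `PROTOTYPE_SHADOWS` table, the three helper functions, and the `math_add` / `math_number` / `NUM` names are all assumptions made for illustration.

```python
import copy

# Hypothetical prototype table: (block type, input name) -> default shadow.
PROTOTYPE_SHADOWS = {
    ("math_add", "A"): {"type": "math_number", "fields": {"NUM": "0"}},
}


def load_connection(block_type, input_name, saved):
    """Rebuild a connection; the shadow comes from the prototype, never the save."""
    prototype = PROTOTYPE_SHADOWS.get((block_type, input_name))
    return {"shadow": copy.deepcopy(prototype), "block": saved.get("block")}


def edit_shadow(connection, field_name, value):
    """Promote the shadow to a real block when the user changes one of its fields."""
    promoted = copy.deepcopy(connection["shadow"])
    promoted["fields"][field_name] = value
    connection["block"] = promoted


def save_connection(connection):
    """Store only the real block, so an untouched shadow adds nothing to the save."""
    block = connection["block"]
    return {"block": block} if block is not None else {}


# An untouched input saves as an empty record.
conn = load_connection("math_add", "A", {})
untouched = save_connection(conn)

# After the user types 42 into the shadow, it is stored as a real block.
edit_shadow(conn, "NUM", "42")
edited = save_connection(conn)

# Loading the edited save restores the real block and a fresh prototype shadow.
reloaded = load_connection("math_add", "A", edited)
```

Under this scheme the saved workspace never contains a shadow at all, which matches the goal of not paying for shadow bytes on every load.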

BeksOmega · July 13, 2021, 2:03pm

Update on this! I've created a design doc outlining our plan to add a plain JavaScript Object serialization method to Blockly (~15 minute read, not including appendices).

For anyone who's interested, thoughts are very much appreciated =) You can leave comments here, in the doc, or message me directly!