RFC: Web component and dictionaries

Hello everyone,

We’re getting pretty close to finishing our implementation of dictionaries for App Inventor. This new datatype gives us an opportunity to revisit some design decisions and think about how best to use it. Mainly, I am thinking about the Web component’s JsonDecode and XMLDecode blocks. At present, these blocks return an associative list (alist) when they encounter a dictionary-like object. We could instead make these methods return Dictionary objects rather than Lists. This has the benefit of improved app performance (dictionary lookups take amortized constant time, versus worst-case linear time in alists), and since dictionaries render as JSON they make interoperating with JSON services easier. The dictionary blocks also include some advanced blocks, namely get/set using a key path. Dictionaries and alists are also interchangeable at the block level. For example, the lookup in pairs (list) block will accept either lists or dictionaries, and similarly the get blocks in the dictionaries drawer will work with alists.
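
To make the lookup-cost difference concrete, here is a minimal Python sketch (not App Inventor code). The helper name `alist_lookup` and the sample pairs are assumptions for illustration only: the alist lookup scans pairs one by one, while the dictionary lookup hashes the key.

```python
def alist_lookup(pairs, key, default=None):
    """Linear scan over a list of [key, value] pairs (worst case O(n))."""
    for k, v in pairs:
        if k == key:
            return v
    return default


# Hypothetical sample data, stored both ways.
pairs = [["name", "Ethereum"], ["symbol", "ETH"], ["rank", 2]]
as_dict = dict(pairs)  # same content as a hash-based dictionary

from_alist = alist_lookup(pairs, "symbol")  # walks the pairs in order
from_dict = as_dict.get("symbol")           # amortized constant-time lookup
```

Both lookups give the same answer; only the cost differs as the number of pairs grows.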

Here are the possible options I’ve thought about, and I encourage any additions or suggestions:

Option 1.
Make the JsonDecode/XMLDecode blocks return dictionaries instead of alists.

This would be the easiest change since it wouldn’t require any new blocks added to the language and we wouldn’t need to bump the Web component’s version number. These methods already return Object, so no API level change is required.

Option 2.
Rename the existing functions to ...AsLists and introduce new methods by the same name that return dictionaries.

This option would have the existing versions of the methods renamed to indicate that they will return lists. We would introduce new methods under the same name guaranteed to return dictionaries like in option 1. This allows people who would prefer the alist behavior to use the original functionality whereas most people will probably benefit from the new method (same as option 1).

Option 3.
Add a parameter to the methods to control whether dictionaries or alists are returned.

This allows us to not introduce new blocks, but still requires an API change and is not backward compatible with older companions that will have the old method signature.

Option 4.
Add a property on the Web component that controls the type.

This allows us to not change any of the method signatures (like option 1), but does require a bump in version due to the addition of a new property. One possible downside of decoupling the control over the behavior from the method is people might not realize that this option is available or what it’s for. It’s also set for every decode call by the web component, which may or may not be desired (unlike option 2 where it’s on a per-invocation basis). The benefit is that it allows the user to choose the parsing behavior similar to option 2.
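
For comparison, here is a rough Python sketch of how options 2, 3, and 4 differ from the caller's point of view. This is not the Web component's code: every name below (`json_text_decode_with_flag`, `WebSketch`, `decode_as_lists`, `to_alists`) is hypothetical and only stands in for the idea.

```python
import json


def to_alists(value):
    """Convert decoded JSON objects to lists of [key, value] pairs."""
    if isinstance(value, dict):
        return [[k, to_alists(v)] for k, v in value.items()]
    if isinstance(value, list):
        return [to_alists(v) for v in value]
    return value


# Option 2: two separate methods, so the return shape is visible in the name.
def json_text_decode(text):
    return json.loads(text)


def json_text_decode_as_lists(text):
    return to_alists(json.loads(text))


# Option 3: one method with an extra (hypothetical) parameter.
def json_text_decode_with_flag(text, as_lists=False):
    value = json.loads(text)
    return to_alists(value) if as_lists else value


# Option 4: one method whose result depends on a (hypothetical) property.
class WebSketch:
    def __init__(self):
        self.decode_as_lists = False

    def json_text_decode(self, text):
        value = json.loads(text)
        return to_alists(value) if self.decode_as_lists else value


text = '{"a": {"b": "c"}}'
web = WebSketch()
web.decode_as_lists = True  # set once, affects every later decode call
```

In the option 4 shape, the call `web.json_text_decode(text)` looks identical whichever way the property is set, which is the decoupling concern described above.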

Personally, I am for option 1 or 2, with a slight lean toward option 2 for people who might really need backward compatibility. This is also motivated by a desire to redesign the structure returned by XMLDecode, which will be a breaking change (and therefore why option 2 is preferable).

Regards,
Evan

Hi Evan,
These new developments are truly nice! I have a few questions though.

  • Can we have more information about what dictionaries will look like in AI2 and how to address them?
  • I think it would be very confusing if XMLTextDecode and JsonTextDecode all of a sudden had a different meaning and returned a different structure. Is it really a problem to have new blocks that could return dictionaries?
  • Would the options as presented above break any old apps?
  • Would the new dictionary results for XMLTextDecode avoid the problems that the current list structure has? It cannot handle CDATA, and in some cases the lists are flattened (when there is only one instance of a certain XML tag), making the result really hard to handle.
  • I would prefer option 5, where there would be new blocks.

One related remark: there is also HTMLTextDecode, which does something completely different. It is a useful block, but the name is really confusing.

Cheers, Ghica.

Option 2 sounds okay by me, with no changes needed to old apps.

I have experimented with attempts at expressing key paths in lists,
and have encountered ambiguity when I encounter JSON that has
numerical tags. A list of path choices from the root gets ambiguous
when I encounter a value like “3” on a path and can’t tell if it means the
third element of an array or the lookup of the value for tag “3”.

How would such a path be expressed?

In response to my question about paths, I imagine a numerical path item
would be interpreted as an array subscript if the branch it addressed were a list, and
as a lookup tag if the branch currently addressed was a dictionary.

In light of that, I suppose it would be presumptuous to impose a requirement of all dictionaries
or all lists on a JSONDecode operation. The JSON markup ({} vs []) should determine the
resultant type for each sub-branch of a JSON decoded string, right?
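
The rule I am imagining could be sketched like this in Python (an illustration only, not how App Inventor implements key paths; `get_at_path` and the sample document are made up). Each path item is treated as a tag when the current branch is a dictionary and as a 1-based subscript when it is a list.

```python
def get_at_path(node, path):
    """Walk a path, letting the type of each branch decide how to read the step."""
    for step in path:
        if isinstance(node, dict):
            node = node[str(step)]      # dictionary branch: step is a tag
        elif isinstance(node, list):
            node = node[int(step) - 1]  # list branch: step is a 1-based index
        else:
            raise TypeError(f"cannot descend into {type(node).__name__}")
    return node


# Hypothetical document containing both a tag "3" and a three-element array.
doc = {"3": "tag three", "items": ["first", "second", "third"]}

by_tag = get_at_path(doc, ["3"])             # "3" read as a tag
by_index = get_at_path(doc, ["items", "3"])  # "3" read as a subscript
```

The same path item "3" resolves differently depending only on the branch type, which is why the decoder needs to preserve that type.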

Option 2.
Plus concise working examples in the Help.

First of all, thank you for developing a new solution. The web component and JSON were always a hassle, but in the end it worked.

Regarding the question, I am in favour of option 1. However, if you decide that you need backwards compatibility as it is a breaking change, I would prefer option 4. If I understand correctly, option 1 would break some apps as the decode functions change behaviour.

Here's the complete list of blocks. I'm still working on documentation, etc. but hope to have something for you to all try by the weekend.

Only option 1 changes the semantics, and we've made it so that dictionaries can be used in place of associative lists (and vice versa), so in theory there shouldn't be any observable difference other than the fact that the is a list? block will return false. This is the primary reason against option 1.

They should not, with the exception of option 1. I'm writing a whole bunch of tests to give a broad range of examples.

I'm planning to submit this as a separate change. This is one of the reasons for choosing Option 2 because people who have apps that rely on the old behavior could use the old block and people who would prefer a more well-defined behavior using dictionaries could use the new blocks. I have a separate document that details my fun with XMLTextDecode that I'll be sharing related to that proposed change.

I believe option 5 you propose is the same as option 2, or at least a slight variant thereof where we keep the existing names and have new blocks for the versions that return dictionaries. I figure that most people will prefer the dictionary behavior, in which case it would make more sense for the "main" blocks to do the dictionary behavior with the renamed blocks providing the old (current) behavior.

For what it’s worth, here is a sample JSON browser that can navigate
objects and lists, with sample mixed weather data in a Media File …


browse_JSON_V2.aia (7.9 KB)

It will be interesting to see how the weather JSON looks after decoding.

Thanks for the example @ABG. Unfortunately, it looks like the site you were scraping has changed its API a bit. You might want to check up on that.

@Red_Panda I’m curious what your motivation for Option 4 over Option 2 would be for the backward-compatibility scenario. Option 2 would end up with 4 blocks JsonTextDecode, XMLTextDecode, JsonTextDecodeAsLists, and XMLTextDecodeAsLists, with the latter two providing the same behavior as JsonTextDecode and XMLTextDecode do today (or Ghica’s proposed Option 5 where the current methods stay the same and we add dictionary variants). In my opinion the real downside of Option 4 is that the property version decouples the semantics of the return value from the block by putting it behind the property. This is fine for expert users but might be missed by new users. It also means that code snippets will be dependent on the property that may not be captured in screenshots, for example.

Yes that's correct. JSON objects ({..}) will decode as dictionaries and JSON arrays ([...]) decode as lists. This is actually quite useful because at the moment these two documents will decode to the same representation in App Inventor:

{
  "a": [["b", "c"]]
}
{
  "a": {"b": "c"}
}

even though at the JSON level they have different encodings and that information is currently lost when it's brought into App Inventor (both encode as ((a ((b c))))).
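
Here is a small Python sketch of that information loss. The `to_alists` helper is an assumption that mimics the current list-of-pairs decoding; it is not the actual Web component code.

```python
import json


def to_alists(value):
    """Mimic alist decoding: JSON objects become lists of [key, value] pairs."""
    if isinstance(value, dict):
        return [[k, to_alists(v)] for k, v in value.items()]
    if isinstance(value, list):
        return [to_alists(v) for v in value]
    return value


doc_array = json.loads('{"a": [["b", "c"]]}')
doc_object = json.loads('{"a": {"b": "c"}}')

# With dictionaries preserved, the two documents stay distinguishable.
distinct_as_dicts = doc_array != doc_object

# After flattening to alists, both become [["a", [["b", "c"]]]].
same_as_alists = to_alists(doc_array) == to_alists(doc_object)
```

Both flags come out true: the documents differ while dictionaries are kept, and become indistinguishable once everything is a list.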

My example has a Media .txt file in it with sample output,
so it can still serve as a test bed (with modification for the
expanded type system) using its built in data.

I was going to include the .txt file as an attachment, but this board
disallowed it for not being on the approved attachment type list. :roll_eyes:

My only motivation is that I want to keep things clean in the blocks list and avoid redundancy. I must admit that the screenshot problem must be considered. I would then ultimately choose option 5 as it avoids confusion and maintains backwards compatibility. When I chose option 4, I only had advanced users in mind.

Here’s a JSON sample to show how important it is to
sense the type (list vs dictionary) of a branch in decoded JSON.
The tags for the currency conversions are numeric,
possibly misleading a path navigator into thinking they
are array indices.
(I would have attached it as a file, but .json is also not on the approved attachment type list.)

The path to the symbol for Ethereum (ETH) is
“data”, “1027”, “symbol”.

The 1027 would be a pitfall for someone following the path if they did not know it was a tag, not an item number subscript.

{
    "data": {
        "1": {
            "id": 1, 
            "name": "Bitcoin", 
            "symbol": "BTC", 
            "website_slug": "bitcoin", 
            "rank": 1, 
            "circulating_supply": 17040687.0, 
            "total_supply": 17040687.0, 
            "max_supply": 21000000.0, 
            "quotes": {
                "USD": {
                    "price": 8125.03, 
                    "volume_24h": 6127350000.0, 
                    "market_cap": 138456093096.0, 
                    "percent_change_1h": 0.36, 
                    "percent_change_24h": -2.3, 
                    "percent_change_7d": -5.64
                }
            }, 
            "last_updated": 1526660372
        }, 
        "1027": {
            "id": 1027, 
            "name": "Ethereum", 
            "symbol": "ETH", 
            "website_slug": "ethereum", 
            "rank": 2, 
            "circulating_supply": 99514610.0, 
            "total_supply": 99514610.0, 
            "max_supply": null, 
            "quotes": {
                "USD": {
                    "price": 678.142, 
                    "volume_24h": 2461300000.0, 
                    "market_cap": 67485036718.0, 
                    "percent_change_1h": 0.31, 
                    "percent_change_24h": -3.57, 
                    "percent_change_7d": -0.95
                }
            }, 
            "last_updated": 1526660359
        }, 
        "52": {
            "id": 52, 
            "name": "Ripple", 
            "symbol": "XRP", 
            "website_slug": "ripple", 
            "rank": 3, 
            "circulating_supply": 39189968239.0, 
            "total_supply": 99992233977.0, 
            "max_supply": 100000000000.0, 
            "quotes": {
                "USD": {
                    "price": 0.666191, 
                    "volume_24h": 365651000.0, 
                    "market_cap": 26108004131.0, 
                    "percent_change_1h": 0.54, 
                    "percent_change_24h": -3.82, 
                    "percent_change_7d": -3.49
                }
            }, 
            "last_updated": 1526660342
        }, 
        "1831": {
            "id": 1831, 
            "name": "Bitcoin Cash", 
            "symbol": "BCH", 
            "website_slug": "bitcoin-cash", 
            "rank": 4, 
            "circulating_supply": 17134675.0, 
            "total_supply": 17134675.0, 
            "max_supply": 21000000.0, 
            "quotes": {
                "USD": {
                    "price": 1159.18, 
                    "volume_24h": 912848000.0, 
                    "market_cap": 19862172567.0, 
                    "percent_change_1h": 0.59, 
                    "percent_change_24h": -8.94, 
                    "percent_change_7d": -16.93
                }
            }, 
            "last_updated": 1526660353
        }, 
        "1765": {
            "id": 1765, 
            "name": "EOS", 
            "symbol": "EOS", 
            "website_slug": "eos", 
            "rank": 5, 
            "circulating_supply": 864651768.0, 
            "total_supply": 900000000.0, 
            "max_supply": 1000000000.0, 
            "quotes": {
                "USD": {
                    "price": 12.4424, 
                    "volume_24h": 1433810000.0, 
                    "market_cap": 10758343155.0, 
                    "percent_change_1h": 0.41, 
                    "percent_change_24h": -5.45, 
                    "percent_change_7d": -17.69
                }
            }, 
            "last_updated": 1526660351
        }, 
        "2": {
            "id": 2, 
            "name": "Litecoin", 
            "symbol": "LTC", 
            "website_slug": "litecoin", 
            "rank": 6, 
            "circulating_supply": 56588813.0, 
            "total_supply": 56588813.0, 
            "max_supply": 84000000.0, 
            "quotes": {
                "USD": {
                    "price": 132.64, 
                    "volume_24h": 365957000.0, 
                    "market_cap": 7505940110.0, 
                    "percent_change_1h": 0.33, 
                    "percent_change_24h": -4.48, 
                    "percent_change_7d": -5.16
                }
            }, 
            "last_updated": 1526660359
        }, 
        "2010": {
            "id": 2010, 
            "name": "Cardano", 
            "symbol": "ADA", 
            "website_slug": "cardano", 
            "rank": 7, 
            "circulating_supply": 25927070538.0, 
            "total_supply": 31112483745.0, 
            "max_supply": 45000000000.0, 
            "quotes": {
                "USD": {
                    "price": 0.239455, 
                    "volume_24h": 83300900.0, 
                    "market_cap": 6208366676.0, 
                    "percent_change_1h": 0.78, 
                    "percent_change_24h": -5.4, 
                    "percent_change_7d": -10.01
                }
            }, 
            "last_updated": 1526660355
        }, 
        "512": {
            "id": 512, 
            "name": "Stellar", 
            "symbol": "XLM", 
            "website_slug": "stellar", 
            "rank": 8, 
            "circulating_supply": 18577009453.0, 
            "total_supply": 103946502380.0, 
            "max_supply": null, 
            "quotes": {
                "USD": {
                    "price": 0.308398, 
                    "volume_24h": 34854200.0, 
                    "market_cap": 5729112561.0, 
                    "percent_change_1h": 0.4, 
                    "percent_change_24h": -6.81, 
                    "percent_change_7d": -2.69
                }
            }, 
            "last_updated": 1526660350
        }, 
        "1720": {
            "id": 1720, 
            "name": "IOTA", 
            "symbol": "MIOTA", 
            "website_slug": "iota", 
            "rank": 9, 
            "circulating_supply": 2779530283.0, 
            "total_supply": 2779530283.0, 
            "max_supply": 2779530283.0, 
            "quotes": {
                "USD": {
                    "price": 1.71571, 
                    "volume_24h": 75040100.0, 
                    "market_cap": 4768867902.0, 
                    "percent_change_1h": 0.51, 
                    "percent_change_24h": -7.77, 
                    "percent_change_7d": -8.93
                }
            }, 
            "last_updated": 1526660351
        }, 
        "1958": {
            "id": 1958, 
            "name": "TRON", 
            "symbol": "TRX", 
            "website_slug": "tron", 
            "rank": 10, 
            "circulating_supply": 65748111645.0, 
            "total_supply": 100000000000.0, 
            "max_supply": null, 
            "quotes": {
                "USD": {
                    "price": 0.0675568, 
                    "volume_24h": 271974000.0, 
                    "market_cap": 4441732029.0, 
                    "percent_change_1h": 0.98, 
                    "percent_change_24h": -2.72, 
                    "percent_change_7d": 0.25
                }
            }, 
            "last_updated": 1526660354
        }
    }, 
    "metadata": {
        "timestamp": 1526660398, 
        "num_cryptocurrencies": 1593, 
        "error": null
    }
}
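
As a quick check of that path, here is a Python sketch using a trimmed excerpt of the data above (only the symbols of the first three entries are kept, to stay short). Python's `json` module decodes objects as dictionaries, so the branch type can be sensed before following the path.

```python
import json

# Trimmed excerpt of the sample above: same tags, most fields omitted.
excerpt = json.loads("""
{"data": {"1": {"symbol": "BTC"},
          "1027": {"symbol": "ETH"},
          "52": {"symbol": "XRP"}}}
""")

branch = excerpt["data"]
is_tagged = isinstance(branch, dict)  # a dictionary, so "1027" is a tag
symbol = branch["1027"]["symbol"]     # path: "data", "1027", "symbol"
entry_count = len(branch)             # far fewer than 1027 entries
```

With only a handful of entries under "data", reading 1027 as an item subscript could never succeed; reading it as a tag gives ETH.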

format of JSON example edited by Taifun

I've fixed this (20 chars).

I find it hard to believe that you are providing shiny new functionality: dictionaries, that everyone will prefer over lists of pairs, and then are hiding it in old flawed functionality! Therefore I still prefer option 5.

Two more questions: are the dictionaries now making a difference between attributes and nested tags for XML and if yes, how? Second, is it guaranteed that the decoded XML will be a dictionary at every level?
Cheers, Ghica.

I say, we can better deprecate those blocks from Web component, and have “parse JSON”, “stringify JSON”, “parse XML” & “stringify XML” blocks in dictionary category. Then it would be so cool and simple :fire::fire::fire:

While I generally like the idea, there are potentially a few issues we need to think about. We will take it under advisement.

  1. When dictionaries are serialized to strings, they serialize as JSON. Thus a “dictionary to json string” is redundant.

  2. When decoding a JSON document, the root can be any of null, a number, a string, a boolean, an array, or an object. This means that the return type isn’t necessarily known at design time and so to place the decode block in one category over another is just preferential (for example, why not put it in the lists drawer?). By keeping it in the component we don’t tie it to an expected return type.

  3. I’m not so sure whether we want to support stringifying to XML or not, mainly because the new structure I’m imagining is a bit more complex than the current representation to fix some problems that @Ghica encountered. Constructing and maintaining the data structure will be non-trivial. I’m willing to say that App Inventor should be opinionated here and prefer JSON to XML wherever it can (although I can also make the counterargument to support XML for web services that require it, though they might be served better by extensions).
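
Points 1 and 2 can be illustrated with Python's standard `json` module (as an analogy only; App Inventor's decoder is separate code). The root of a decoded document can be any of six kinds of value, and a dictionary already serializes to JSON text.

```python
import json

# One sample document per possible JSON root type.
samples = ["null", "42", '"text"', "true", "[1, 2]", '{"k": "v"}']

# The decoded root type is only known once the document has been parsed.
root_types = [type(json.loads(s)).__name__ for s in samples]

# Serializing a dictionary already yields JSON text.
as_text = json.dumps({"k": "v"})
```

Only the last sample decodes to a dictionary, which is why the decode block does not obviously belong in any one drawer.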

Separately, I’ve been thinking about a csv table to dictionary block that parses the table and returns a list of dictionaries, where each dictionary represents a row in the table and its keys are the column names. That isn’t in the version we currently have implemented though.
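
The idea can be sketched in Python with the standard `csv` module. The function name `csv_to_dicts` and the sample table are assumptions for illustration; no such block exists yet.

```python
import csv
import io


def csv_to_dicts(text):
    """Parse CSV text into one dictionary per row, keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


# Hypothetical two-column table with a header row.
table = "name,symbol\nBitcoin,BTC\nEthereum,ETH\n"
rows = csv_to_dicts(table)
second_symbol = rows[1]["symbol"]  # look up a cell by row and column name
```

Each row can then be addressed by column name rather than by position.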

Generally, the return type of the JSON is known beforehand when users consume a REST API or similar. If the return type is not known, then the user-dev might want to just display the JSON string as text in a Label or write it to a file (which gets serialised to string as you mentioned). Also, there are the is text?, is number?, is boolean?, and is a dictionary? blocks to detect the type.

So, I don't think there would be any harm in placing those blocks in dictionary category.

I didn't put much thought on that so there may be use cases which I missed.

Here’s a link to the latest version of the dictionaries work. It is not backward compatible with the previous versions due to a number of changes in the blocks. This version also has an experimental version of the Web component that implements option 2. Specifically, JsonTextDecode and XMLTextDecode now return dictionaries. The old behaviors are available using JsonTextDecodeAsLists and XMLTextDecodeAsLists. The upgraders will rename the blocks if an older block is imported.

The XMLDecode block also provides a more robust return value compared to the earlier version. My reasoning for the new data structure format is described here:

I’m still in the process of setting up the buildserver, so you won’t be able to build apps with it just yet, but I am interested in how the companion behaves and how the new dictionaries model works for people.

Note that this version keeps the parse logic within the Web component.