RFC: Moderately improved dict printing #5706

Merged (1 commit) on Jun 2, 2014

mbauman
Member

@mbauman mbauman commented Feb 6, 2014

This is a modest attempt at improving the situation for issue #1759. I've added
slightly enhanced summaries with information about the number of k/v pairs for
Associative and Key/ValueIterator types.

I'm still slightly confused by all the different methods used for show, but
this PR keeps the old behavior (mostly) as showcompact. I then added new
functionality in show and showlimited, using newlines as delimiters between
key/value pairs. When output is limited, values are truncated at newlines or
the TTY screen edge and a limited number of pairs are printed.

The key and value iterators no longer spit out entire dictionaries when shown.
They instead show a limited {} array of the keys and values.
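
The truncation rule described above (cut a value at its first newline or at the screen edge, then mark the cut) can be sketched roughly as follows. This is not the code from the PR: `truncate_value` is a hypothetical name, and the sketch assumes modern Julia 1.x syntax rather than the 0.3-era syntax used elsewhere in this thread.

```julia
# Hypothetical helper, not the PR's implementation: shorten the string form of
# a value so it ends at its first newline or at `width` terminal columns.
function truncate_value(s::AbstractString, width::Integer)
    line = first(split(s, '\n'))          # keep only the first line
    multiline = length(line) < length(s)  # true if later lines were dropped
    if !multiline && textwidth(line) <= width
        return String(line)               # fits as-is, no ellipsis needed
    end
    buf = IOBuffer()
    used = 0
    for c in line
        w = textwidth(c)
        used + w > width - 1 && break     # reserve one column for the ellipsis
        print(buf, c)
        used += w
    end
    print(buf, '…')
    return String(take!(buf))
end
```

Because the helper only sees a string, it cannot know whether a closing `]` or `"` was lost in the cut, which is the limitation discussed later in this thread.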

@JeffBezanson
Member

Could you post some sample output? That will make it much easier for people to bikeshed this :)

@mbauman
Member Author

mbauman commented Mar 8, 2014

Sure thing. It's pretty simplistic, but I think it's better than what we have now.

julia> d = {"key$(i)" => randn(i,i) for i = 1:100} # shows one key per line and truncates values
Dict{Any,Any} with 100 entries:
  "key11" => [1.42777 0.879199 -0.586857 -0.659849 -0.702172 -0.483055 1.54336…
  "key70" => [-0.140932 0.805937 0.717354 -0.210394 -1.02608 0.121015 -0.04214…
  "key72" => [0.0818324 1.5367 -0.637601 0.61147 0.0528507 0.225477 1.94511 -0…
  "key89" => [-1.15099 0.947135 -0.0314771 -0.473041 0.243769 -2.19659 -0.3702…
  "key65" => [-1.2739 1.72985 -0.189676 1.14118 0.127005 0.844568 -1.64292 -0.…
  "key45" => [0.432352 0.260041 -0.584482 -0.957978 1.12756 -1.45072 1.29183 0…
  "key67" => [-0.486363 0.0937845 -1.64004 -1.67119 0.549293 -0.711426 0.12537…
  "key41" => [0.220474 -1.46249 -0.679411 -0.018757 -2.02637 -0.397589 0.31227…
  "key63" => [-0.943828 0.165266 1.37679 0.853508 1.4507 -1.59887 1.77201 -0.2…
  "key32" => [-0.587719 -0.01194 1.37864 -1.22133 -1.55792 3.09753 0.475866 -1…
  "key42" => [0.236417 0.750655 -0.495845 1.13211 -0.294828 -0.709171 1.15787 …
  "key2" => [-0.54387 0.592814…
  "key64" => [0.116901 1.05483 -1.0676 0.250556 1.03735 -0.512102 0.849797 0.0…
  "key7" => [-0.338442 -0.478758 -0.299093 1.09469 -2.361 -0.0666913 0.519315…
  "key86" => [0.508162 0.126979 -1.58807 0.106776 -0.637767 2.26487 -1.15231 1…
  "key1" => [-0.906512]
  "key54" => [0.373752 0.516723 1.04254 0.223891 -0.264017 -0.185658 -0.975029…
  "key96" => [2.29026 -0.0467733 1.07456 0.345007 1.37166 0.970921 0.42396 0.3…
  "key8" => [-1.12283 -1.05456 -0.145667 1.87225 1.31385 -1.99974 0.842182 -1.…
  "key76" => [-0.706051 1.44636 1.77974 0.208748 -0.327202 -0.0452911 0.453084…
  "key100" => [0.350757 -0.519514 0.378488 1.61831 0.0235384 1.75818 1.11947 -…
  "key4" => [0.420041 -0.417906 -0.586962 1.13799…
  ⋮ => ⋮

julia> keys(d) # essentially punts to show/showlimited for Array{Any,1}
KeyIterator for a Dict{Any,Any} with 100 entries. Keys:
  {"key11","key70","key72","key89","key65","key45","key67","key41","key63","key32"  …  "key59","key73","key83","key25","key95","key30","key52","key35","key58","key6"}

julia> summary(d)
"Dict{Any,Any} with 100 entries"

idx < 1 || println(io)
print(" ")
key = reprlimited(k)
print(io, key)
Member

Why not just replace these two lines with showlimited(io, key)?

Member

Okay, never mind. I see that you use the length of key below.

@kmsquire
Member

kmsquire commented Mar 9, 2014

+1. Seems like an improvement to me!

@mbauman
Member Author

mbauman commented Mar 10, 2014

Especially compared to the status quo, where the above example generates over 82k lines of output (on a TTY with 80 columns). It may seem contrived, but this was inspired by work with very large and highly nested data structures coming from MAT.jl and PLX.jl.

@StefanKarpinski
Member

It's not contrived at all. I suspect most of us have worked with exactly this sort of data.

@mbauman
Member Author

mbauman commented Mar 10, 2014

I've updated the PR to use sprint instead of reinventing the wheel. It's functionally identical… I'm just slowly learning better practices.
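
For readers unfamiliar with `sprint`, here is a minimal sketch of the equivalence being referred to. It assumes modern Julia 1.x and is not taken from the PR diff; the variable names are illustrative only.

```julia
# Manual approach: render into a buffer, then extract the string.
buf = IOBuffer()
show(buf, [1, 2, 3])
manual = String(take!(buf))

# `sprint` wraps the same pattern: it calls show(io, x) on a temporary
# buffer and returns whatever was written as a String.
viasprint = sprint(show, [1, 2, 3])
```

Both variables hold the same string, so the `sprint` form is simply the shorter way to capture what `show` would print.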

As an aside, I've since hacked together a show-alternative that I specialized for a handful of datatypes. I found it immensely useful when digging into the MAT.jl format: see summarize in this IJulia notebook. It does some things that I probably wouldn't want in a general case (like printing array indices) and isn't quite accurate in some of its representations, but it is amazingly useful.

@JeffBezanson
Member

I love xxd. Looks like a good candidate for the standard library.

@mbauman
Member Author

mbauman commented Mar 18, 2014

PR updated to simply use show instead of showlimited when printing values - the output once again looks like the example I provided above when printed in the REPL (it had changed after 213b4b3). But it now also adds the nice feature that explicitly calling show will display everything.

@ihnorton
Member

Bump. Having keys(...) dump everything is really annoying.

@timholy
Member

timholy commented May 31, 2014

One-line-per-entry would be a lovely improvement. 👍 for this.

@mbauman
Member Author

mbauman commented May 31, 2014

I can do this better now. I'll rework it.

@mbauman mbauman changed the title Moderately improved dict printing WIP: Moderately improved dict printing May 31, 2014
@mbauman
Member Author

mbauman commented Jun 2, 2014

Unicode is hard. There were a whole slew of issues in my original implementation. But I think I have something now that is really awesome. This now has a much more Julian style, aligns the =>s and has at least a few tests, too. The example I posted above is now (with a very small terminal):

julia> d = {"key$(i)" => randn(i,i) for i = 1:10}
Dict{Any,Any} with 10 entries:
  "key6"  => [2.260137601709653 0.4532699818344903 2.2870672099482188 0.9816818…
  "key9"  => [0.5742554985982649 0.3355127474514725 0.13603341808213334 1.09958…
  "key10" => [-2.0669876353622514 -1.256875903344474 0.5563481797172773 -0.2080…
  "key2"  => [0.4160632908379072 0.11721273749843224…
  "key8"  => [-0.6246048294019402 -0.26452826097987675 0.741202984789746 1.1962…
  ⋮       => ⋮

julia> keys(d)
KeyIterator for a Dict{Any,Any} with 10 entries. Keys:
  "key6"
  "key9"
  "key10"
  "key2"
  ⋮

julia> values(d)
ValueIterator for a Dict{Any,Any} with 10 entries. Values:
  [2.260137601709653 0.4532699818344903 2.2870672099482188 0.9816818019880121 -…
  [0.5742554985982649 0.3355127474514725 0.13603341808213334 1.0995815440474772…
  [-2.0669876353622514 -1.256875903344474 0.5563481797172773 -0.208060864819070…
  [0.4160632908379072 0.11721273749843224…
  ⋮

Or with some unicode:

[CharString(rand('α':'ω',i)) => CharString(rand('α':'ω',i*2)) for i = 1:50]
Dict{UTF32String,UTF32String} with 50 entries:
  "οςφηςσξωηξφδβγδτπκαλλρ… => "πφψςξμοβχδσωεωσδεκβφνςφεοξφογκςδωςεεασμρθρυθιψςκ…
  "ςαπβαγχρωβηυξγφβ"       => "ρφςησεφυςπψευασνπιηχεοβρπωχυμκηφ"
  "λρβςαψ"                 => "γεβηυφθοωηωφ"
  "χβζοξπλμδλκθκφπφδξηηθκ… => "ρτχγεωζδλζχτγχφβδμωησοαξαζαιαξιιπξυφψβοθβκοξδψρμ…
  "φιζνπλυτβδφηωγθργεχκεα… => "σγζζβλκπμμυρχθχεψτυννμςζωξδθψγφαωσκσυθξρυωυτφλψη…
  ⋮                        => ⋮

Or the absolutely cruel:

julia> [utf8(char(rand(255:2000,i))) => utf8(char(rand(255:2000,i*5))) for i = 1:30]
Dict{UTF8String,UTF8String} with 30 entries:
  "\u76c\u560ξɿ̮֭\u72d֙\u5… => "\u38b͈ǝ˫ȠڇЀѹԊ̚ѤӫץŃەǠвڙܡʆӉɜΎӪ֯Ѧ˥ǾIJϘпĈ\u7b2\u235ܩפՒ\…
  "ڠɵ"                     => "רĦ\u63bθєՑΒƭ̱ǡ"
  "ǫٷ\u221ңʹД٥ܷϞܧĈҸ\u61aА… => "ڪժƺ܅ʍ͢ƿԷȳ҂ڲŐȥ\u600֒\u60b٭\u777ܸľƹ\u63fح˝ІĄڡʹ\u7c7…
  "ۤɱ̡ȑ\u759ʄ݄Ŵӱ\u2faЖ\u76… => "س\u5c7ɸҚ\u23b̦Ĕ\u63fغ̈ʡΝͥűզڝĵ܅ƫΎŋʺՌΟ̭כٓƷ֮\u74c˚گ˓ŋծׄ…
  "ęѫ֣\u752܃ƈ\u756ƝͺјΉ΅Ϥ\… => "̟ۍƹگѵ\u51aȑӎ\u3faۼǕƥǥظչɼް\u381лڀڿأޡܘ݅ٷԾ͋ӟ\u61c\u75…
  ⋮                        => ⋮

@mbauman mbauman changed the title WIP: Moderately improved dict printing RFC: Moderately improved dict printing Jun 2, 2014
This is an attempt at improving the situation for issue JuliaLang#1759.  I've added
slightly enhanced `summary`s with information about the number of k/v pairs for
Associative and Key/ValueIterator types.

This PR keeps the old behavior (mostly) as `showcompact`. I then added new
functionality in `show` and `showlimited` for Associative and
Key/ValueIterators, printing newlines as delimiters between key/value pairs.
When output is limited, keys are truncated to the left third of the TTY screen
and values are truncated at newlines or the screen edge, with the `=>`
separators aligned.
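
A rough sketch of the layout this commit message describes, under stated assumptions: modern Julia 1.x syntax, a plain list of pairs instead of an Associative (so the order is predictable), and hypothetical helper names `fit` and `aligned_lines`. It is an illustration of the idea, not the merged implementation.

```julia
# Hypothetical helper: shorten `s` to at most `width` terminal columns,
# ending with an ellipsis when something was cut.
function fit(s::AbstractString, width::Integer)
    textwidth(s) <= width && return String(s)
    buf = IOBuffer()
    used = 0
    for c in s
        w = textwidth(c)
        used + w > width - 1 && break   # keep one column for the ellipsis
        print(buf, c)
        used += w
    end
    print(buf, '…')
    return String(take!(buf))
end

# Hypothetical layout routine: one line per pair, keys limited to a third
# of the screen, `=>` separators aligned, values cut at the screen edge.
function aligned_lines(kvs, cols::Integer)
    keystrs = [fit(repr(k), cols ÷ 3) for (k, _) in kvs]
    keywidth = isempty(keystrs) ? 0 : maximum(textwidth, keystrs)
    room = cols - 2 - keywidth - 4      # 2-space indent plus " => "
    lines = String[]
    for (ks, (_, v)) in zip(keystrs, kvs)
        pad = " "^(keywidth - textwidth(ks))
        valstr = fit(first(split(repr(v), '\n')), room)
        push!(lines, "  " * ks * pad * " => " * valstr)
    end
    return lines
end
```

For example, `aligned_lines(["key6" => 1, "key10" => 2], 80)` pads the shorter key so that both `=>` separators start in the same column.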
@nalimilan
Member

Looks good, but I've spotted two details: on the first line, the text is cut too early, or the ending ] is not printed correctly (and same when calling values); on the last line, => is not aligned with those above.

  "key2"  => [0.4160632908379072 0.11721273749843224…
  "key8"  => [-0.6246048294019402 -0.26452826097987675 0.741202984789746 1.1962…
  ⋮       => ⋮

@mbauman
Member Author

mbauman commented Jun 2, 2014

The first issue is because there's no way for me to ask the values to show themselves in the space I have. Without revamping the whole show hierarchy, the best dict can do is arbitrarily truncate... at which point I don't know if there should be a trailing ] or ", for example. "key2" is a 2x2 array, so it's getting truncated at the newline between rows 1 and 2. Compare against showall and the current implementation.

The second is a font issue. If your monospace font doesn't have those glyphs, a proportional font is substituted instead. Not much we can do there. On iOS that last => appears slightly to the right, and on OS X it's slightly to the left (as viewed on this webpage). But within a good terminal it should appear aligned for reasonable Unicode content (the last example isn't aligned, but it doesn't overflow lines, either).

@nalimilan
Member

I see, thanks!

JeffBezanson added a commit that referenced this pull request Jun 2, 2014
RFC: Moderately improved dict printing
@JeffBezanson JeffBezanson merged commit c674ba6 into JuliaLang:master Jun 2, 2014
@StefanKarpinski
Member

Sorry for not testing earlier – this is the first thing I tried:

julia> d = Dict()
Dict{Any,Any} with 0 entries

julia> d[1] = "foo"
"foo"

julia> d
Dict{Any,Any} with 1 entry:
   => "foo"

So, minor tweaking is required for this.

@JeffBezanson
Member

Should be fixed now.

@mbauman
Member Author

mbauman commented Jun 3, 2014

Argh, so sorry I missed that trivial case. Thanks for the quick fix, Jeff.

7 participants