better dict printing #1759

Closed
StefanKarpinski opened this issue Dec 14, 2012 · 10 comments
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request

@StefanKarpinski
Member

We're very smart about printing arrays, but we're still in the stone age when it comes to printing dicts. We should elide large hashes instead of dumping huge amounts of data on screen (we should never do that). I would also prefer to print key-value pairs one per line when the hash can't fit on a single line. Ideally, a returned hash would print well enough that things like Pkg.status could return a map instead of formatting their own output.

@ViralBShah
Member

Also see #25 and #29.

@sglyon
Contributor

sglyon commented Nov 6, 2013

Any update on this?

@StefanKarpinski
Member Author

Nobody's done anything about it, but it would be lovely to have.

@sglyon
Contributor

sglyon commented Nov 6, 2013

I am working more with dicts on a few projects and would love to see this.

Any suggestions or comments for how they should be printed?

@StefanKarpinski
Member Author

It's a bit tricky. If they're short enough, then one line is good. If they're longer than that, then one key-value pair per line would be better.

@timholy
Member

timholy commented Nov 6, 2013

If it's at all helpful, Images has a little code to do one pair per line printing.

@StefanKarpinski
Member Author

I don't think the hard part is printing one pair per line, it's deciding how to print things without going ahead and printing all of them first. So, for example, if you have a dict with a million entries, you still only want to print a screenful of them.
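
The point about not formatting everything first can be illustrated with a small sketch. It is written in Python (the comparison language used later in this thread) rather than Julia, and `limited_pairs` and `max_rows` are hypothetical names chosen for illustration, not part of any existing API. The idea is only to pull as many entries from the iterator as will be shown and to get the remaining count from the dict's length.

```python
from itertools import islice


def limited_pairs(d, max_rows=20):
    """Return at most max_rows formatted lines without visiting every entry."""
    # islice stops pulling items after max_rows, so a huge dict is not traversed.
    lines = [f"{k!r} => {v!r}" for k, v in islice(d.items(), max_rows)]
    hidden = len(d) - len(lines)  # len() on a dict does not iterate over it
    if hidden > 0:
        lines.append(f"... ({hidden} more)")
    return lines


big = {i: i * i for i in range(1_000_000)}
print("\n".join(limited_pairs(big, max_rows=5)))
```

Only five pairs are formatted here, regardless of how many entries the dict holds; which five is whatever the iteration order yields first.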

@sglyon
Contributor

sglyon commented Nov 6, 2013

@timholy thanks for that line. It is definitely a start at getting where we need to go.

I have a few questions, but in asking them I apologize that I may make many comparisons to Python. Python is what I am most familiar with.

When Python prints a dict at the interpreter, it internally calls repr on each key and value in the dict. I think this is a good idea and would be happy to see this for Julia dicts. My questions for this are:

  • When the REPL prints an object to screen, what function does it call on the object? In my experimentation, it seems that perhaps display is called at some point.
  • How do new types define how they are printed? Perhaps the show method?
  • Is there a Julia function to get a string representation of the "message" that is printed to the screen? Something equivalent to Python's repr? I see that there is a Julia repr, but it doesn't seem to return exactly what is printed in every case. For example, consider the following:
julia> x = [1:5];

julia> display(x)
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

julia> repr(x)
"[1,2,3,4,5]"

Or another example with arrays (FWIW I think printing an array this large dumps way too much information on screen, but I also think that numpy doesn't show enough at times. Anyway, this is a conversation to be had in another issue):

julia> x = reshape([1:1000000], 1000, 1000);

julia> display(x)
1000x1000 Array{Int64,2}:
    1  1001  2001  3001  4001  5001  6001  7001  8001   9001  …  994001  995001  996001  997001  998001   999001
    2  1002  2002  3002  4002  5002  6002  7002  8002   9002     994002  995002  996002  997002  998002   999002
    3  1003  2003  3003  4003  5003  6003  7003  8003   9003     994003  995003  996003  997003  998003   999003
    4  1004  2004  3004  4004  5004  6004  7004  8004   9004     994004  995004  996004  997004  998004   999004
    5  1005  2005  3005  4005  5005  6005  7005  8005   9005     994005  995005  996005  997005  998005   999005
    6  1006  2006  3006  4006  5006  6006  7006  8006   9006  …  994006  995006  996006  997006  998006   999006
    7  1007  2007  3007  4007  5007  6007  7007  8007   9007     994007  995007  996007  997007  998007   999007
    8  1008  2008  3008  4008  5008  6008  7008  8008   9008     994008  995008  996008  997008  998008   999008
    9  1009  2009  3009  4009  5009  6009  7009  8009   9009     994009  995009  996009  997009  998009   999009
   10  1010  2010  3010  4010  5010  6010  7010  8010   9010     994010  995010  996010  997010  998010   999010
   11  1011  2011  3011  4011  5011  6011  7011  8011   9011  …  994011  995011  996011  997011  998011   999011
   12  1012  2012  3012  4012  5012  6012  7012  8012   9012     994012  995012  996012  997012  998012   999012
   13  1013  2013  3013  4013  5013  6013  7013  8013   9013     994013  995013  996013  997013  998013   999013
   14  1014  2014  3014  4014  5014  6014  7014  8014   9014     994014  995014  996014  997014  998014   999014
   15  1015  2015  3015  4015  5015  6015  7015  8015   9015     994015  995015  996015  997015  998015   999015
   16  1016  2016  3016  4016  5016  6016  7016  8016   9016  …  994016  995016  996016  997016  998016   999016
   17  1017  2017  3017  4017  5017  6017  7017  8017   9017     994017  995017  996017  997017  998017   999017
   18  1018  2018  3018  4018  5018  6018  7018  8018   9018     994018  995018  996018  997018  998018   999018
   19  1019  2019  3019  4019  5019  6019  7019  8019   9019     994019  995019  996019  997019  998019   999019
   20  1020  2020  3020  4020  5020  6020  7020  8020   9020     994020  995020  996020  997020  998020   999020
   21  1021  2021  3021  4021  5021  6021  7021  8021   9021  …  994021  995021  996021  997021  998021   999021
   22  1022  2022  3022  4022  5022  6022  7022  8022   9022     994022  995022  996022  997022  998022   999022
   23  1023  2023  3023  4023  5023  6023  7023  8023   9023     994023  995023  996023  997023  998023   999023
   24  1024  2024  3024  4024  5024  6024  7024  8024   9024     994024  995024  996024  997024  998024   999024
   25  1025  2025  3025  4025  5025  6025  7025  8025   9025     994025  995025  996025  997025  998025   999025
   26  1026  2026  3026  4026  5026  6026  7026  8026   9026  …  994026  995026  996026  997026  998026   999026
   27  1027  2027  3027  4027  5027  6027  7027  8027   9027     994027  995027  996027  997027  998027   999027
   28  1028  2028  3028  4028  5028  6028  7028  8028   9028     994028  995028  996028  997028  998028   999028
   29  1029  2029  3029  4029  5029  6029  7029  8029   9029     994029  995029  996029  997029  998029   999029
                                                                                                            
  972  1972  2972  3972  4972  5972  6972  7972  8972   9972     994972  995972  996972  997972  998972   999972
  973  1973  2973  3973  4973  5973  6973  7973  8973   9973     994973  995973  996973  997973  998973   999973
  974  1974  2974  3974  4974  5974  6974  7974  8974   9974     994974  995974  996974  997974  998974   999974
  975  1975  2975  3975  4975  5975  6975  7975  8975   9975     994975  995975  996975  997975  998975   999975
  976  1976  2976  3976  4976  5976  6976  7976  8976   9976  …  994976  995976  996976  997976  998976   999976
  977  1977  2977  3977  4977  5977  6977  7977  8977   9977     994977  995977  996977  997977  998977   999977
  978  1978  2978  3978  4978  5978  6978  7978  8978   9978     994978  995978  996978  997978  998978   999978
  979  1979  2979  3979  4979  5979  6979  7979  8979   9979     994979  995979  996979  997979  998979   999979
  980  1980  2980  3980  4980  5980  6980  7980  8980   9980     994980  995980  996980  997980  998980   999980
  981  1981  2981  3981  4981  5981  6981  7981  8981   9981  …  994981  995981  996981  997981  998981   999981
  982  1982  2982  3982  4982  5982  6982  7982  8982   9982     994982  995982  996982  997982  998982   999982
  983  1983  2983  3983  4983  5983  6983  7983  8983   9983     994983  995983  996983  997983  998983   999983
  984  1984  2984  3984  4984  5984  6984  7984  8984   9984     994984  995984  996984  997984  998984   999984
  985  1985  2985  3985  4985  5985  6985  7985  8985   9985     994985  995985  996985  997985  998985   999985
  986  1986  2986  3986  4986  5986  6986  7986  8986   9986  …  994986  995986  996986  997986  998986   999986
  987  1987  2987  3987  4987  5987  6987  7987  8987   9987     994987  995987  996987  997987  998987   999987
  988  1988  2988  3988  4988  5988  6988  7988  8988   9988     994988  995988  996988  997988  998988   999988
  989  1989  2989  3989  4989  5989  6989  7989  8989   9989     994989  995989  996989  997989  998989   999989
  990  1990  2990  3990  4990  5990  6990  7990  8990   9990     994990  995990  996990  997990  998990   999990
  991  1991  2991  3991  4991  5991  6991  7991  8991   9991  …  994991  995991  996991  997991  998991   999991
  992  1992  2992  3992  4992  5992  6992  7992  8992   9992     994992  995992  996992  997992  998992   999992
  993  1993  2993  3993  4993  5993  6993  7993  8993   9993     994993  995993  996993  997993  998993   999993
  994  1994  2994  3994  4994  5994  6994  7994  8994   9994     994994  995994  996994  997994  998994   999994
  995  1995  2995  3995  4995  5995  6995  7995  8995   9995     994995  995995  996995  997995  998995   999995
  996  1996  2996  3996  4996  5996  6996  7996  8996   9996  …  994996  995996  996996  997996  998996   999996
  997  1997  2997  3997  4997  5997  6997  7997  8997   9997     994997  995997  996997  997997  998997   999997
  998  1998  2998  3998  4998  5998  6998  7998  8998   9998     994998  995998  996998  997998  998998   999998
  999  1999  2999  3999  4999  5999  6999  7999  8999   9999     994999  995999  996999  997999  998999   999999
 1000  2000  3000  4000  5000  6000  7000  8000  9000  10000     995000  996000  997000  998000  999000  1000000

julia> repr_x = repr(x);

julia> length(repr_x)
7893025

julia> print(repr_x[1:1000])
1000x1000 Array{Int64,2}:
    1  1001  2001  3001  4001  5001  6001  7001  8001   9001  10001  11001  12001  13001  14001  15001  16001  17001  18001  19001  20001  21001  22001  23001  24001  25001  26001  27001  28001  29001  30001  31001  32001  33001  34001  35001  36001  37001  38001  39001  40001  41001  42001  43001  44001  45001  46001  47001  48001  49001  50001  51001  52001  53001  54001  55001  56001  57001  58001  59001  60001  61001  62001  63001  64001  65001  66001  67001  68001  69001  70001  71001  72001  73001  74001  75001  76001  77001  78001  79001  80001  81001  82001  83001  84001  85001  86001  87001  88001  89001  90001  91001  92001  93001  94001  95001  96001  97001  98001   99001  100001  101001  102001  103001  104001  105001  106001  107001  108001  109001  110001  111001  112001  113001  114001  115001  116001  117001  118001  119001  120001  121001  122001  123001  124001  125001  126001  127001  128001  129001  130001  131001  132001  133001  134001  1

Notice that the return from repr(x) is clearly not the same as what is printed with display(x).

The next set of questions are more design related. The big question I would like an answer to is "how much information should be printed for large dicts?" I have thought of a few sub-questions that we should consider that might help us answer this question:

  • How should we decide which elements of the dict are shown and which ones aren't? Unlike an array or tuple, the dict is unordered, so it is not as natural to print the first n items, then a ..., then the last n items.
  • What do we do if each key-value pair in the dict has a large return value from repr?
  • Assume we collect a big string for the dict using key => value\n for each key-value pair in the dict, where key and value stand for repr(key) and repr(value). How do we then decide how to cut the size of the printed value for the dict? Should we limit to a fixed number of characters? To a number of lines?
    • A related question: how do we decide how much information is too much for one line? Again, should we limit the one-line print messages to a fixed number of characters? A fixed number of key-value pairs? I have done some experimenting with Python, which I will show now:
In [56]: a12_1 = {i:i for i in range(12)}; a12_1
Out[56]: {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11}

In [57]: a13_1 = {i:i for i in range(13)}; a13_1
Out[57]: 
{0: 0,
 1: 1,
 2: 2,
 3: 3,
 4: 4,
 5: 5,
 6: 6,
 7: 7,
 8: 8,
 9: 9,
 10: 10,
 11: 11,
 12: 12}

In [58]: a12_2 = {i:i+10 for i in range(12)}; a12_2
Out[58]: 
{0: 10,
 1: 11,
 2: 12,
 3: 13,
 4: 14,
 5: 15,
 6: 16,
 7: 17,
 8: 18,
 9: 19,
 10: 20,
 11: 21}

In [59]: a11_2 = {i:i+10 for i in range(11)}; a11_2
Out[59]: {0: 10, 1: 11, 2: 12, 3: 13, 4: 14, 5: 15, 6: 16, 7: 17, 8: 18, 9: 19, 10: 20}

In [60]: map(lambda x: len(repr(x)), [a12_1, a13_1, a12_2, a11_2])
Out[60]: [76, 84, 86, 78]

In [68]: {1:"-"*35, 2:'*'*30}
Out[68]: {1: '-----------------------------------', 2: '******************************'}

In [69]: {1:"-"*35, 2:'*'*31}
Out[69]: 
{1: '-----------------------------------',
 2: '*******************************'}

In [71]: len(repr({1:"-"*35, 2:'*'*30}))
Out[71]: 79

In [72]: len(repr({1:"-"*35, 2:'*'*31}))
Out[72]: 80

From the experiment, it seems that Python (here, at the IPython prompt) prints a dict on one line if its repr is 79 characters or fewer, and switches to multi-line printing at 80 characters and above. Is that something we would like to emulate here?
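
The 79-character rule observed above can be written down as a short sketch. `format_dict` and `max_width` are hypothetical names, and this is only an imitation of the behaviour seen in the session, not the code the Python prompt actually runs.

```python
def format_dict(d, max_width=79):
    """One line if the repr fits in max_width characters, else one pair per line."""
    one_line = repr(d)
    if len(one_line) <= max_width:
        return one_line
    pairs = [f"{k!r}: {v!r}" for k, v in d.items()]
    # Mimic the layout in the session above: "{" before the first pair,
    # a one-space indent on later lines, and "}" after the last pair.
    return "{" + ",\n ".join(pairs) + "}"


print(format_dict({i: i for i in range(12)}))  # 76 characters: stays on one line
print(format_dict({i: i for i in range(13)}))  # 84 characters: one pair per line
```

Note that this sketch builds the full repr before deciding, so for very large dicts it would need a cap on the number of entries as well.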

I'm sure there are more questions to be asked in order for us to get this right, but that brain dump should suffice for now.
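
For reference on the "how much should be printed" questions, Python's standard library has one existing answer in the `reprlib` module, which caps the number of entries shown and shortens long strings. The sketch below only demonstrates that module; the limits chosen (3 pairs, 20 characters) are arbitrary values for illustration, and nothing here is a proposal for what the Julia limits should be.

```python
import reprlib

limiter = reprlib.Repr()
limiter.maxdict = 3      # show at most 3 key-value pairs, then "..."
limiter.maxstring = 20   # shorten long strings that appear inside the dict

d = {i: "x" * 100 for i in range(1000)}
print(limiter.repr(d))

# The module-level helper uses the default limits (4 pairs for a dict).
print(reprlib.repr({i: i for i in range(10)}))
```

As far as I can tell, `reprlib` sidesteps the ordering question by showing the first few keys in sorted order when the keys can be sorted.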

@BobPortmann
Contributor

I feel that "less is more" when it comes to printing dicts (i.e., get the overall picture and drill down if you want more info about the individual elements). The current dict printing is unusable when the objects in the dict get large (pretty much always the case for me). I think that printing each entry on its own line is best, and I favor showing only the key and the type of the value for most types. For simple types (strings and numbers), also showing the value is nice, but only up to the end of the line. For arrays, just show the type and size (not the contents). I have some code that implements this on my other computer that I can post tonight to see what people think. Perhaps if there were a generalized way to ask a type to print a short (less than one line) output of its content, then that could be used (but I think this would work best if the caller could ask for a max size and the routine could return an empty string if it could not do it).
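
A rough sketch of the scheme described in this comment, written in Python for illustration: simple values are shown but cut at the line width, containers show only their type and size, and everything else shows only its type. `short_value` and `summarize` are hypothetical names, and the choice of which types count as "simple" is an assumption.

```python
def short_value(v, max_len):
    """Return a one-line summary of v that is at most max_len characters for simple types."""
    if isinstance(v, (str, int, float, bool)):
        text = repr(v)
        # Simple types: show the value, cut off if it would run past the line.
        return text if len(text) <= max_len else text[: max_len - 3] + "..."
    if isinstance(v, (list, tuple, dict, set)):
        # Containers: type and size only, never the contents.
        return f"{type(v).__name__} of {len(v)} items"
    return type(v).__name__  # anything else: just the type


def summarize(d, width=79):
    """One line per entry: repr of the key, then a short summary of the value."""
    lines = []
    for k, v in d.items():
        key = repr(k)
        room = max(width - len(key) - len(" => "), 3)
        lines.append(f"{key} => {short_value(v, room)}")
    return lines


print("\n".join(summarize({"a": [0] * 1000, "b": 1.5, "c": "x" * 200})))
```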

For the case of a dict with many things in it, I think printing one page and then asking to continue or quit (like less or more) is the way to go. In fact, this would help in other places in the REPL as well. Maybe when the new repl.jl is introduced it could include a pager.
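
The paging idea can be sketched as a generator that hands out one screenful at a time and leaves the continue-or-quit decision to the caller. This is a Python illustration only; `pages` is a hypothetical name, and reserving one row for a prompt is an assumption.

```python
import shutil


def pages(lines, height=None):
    """Yield successive screenfuls of lines; the caller decides whether to continue."""
    if height is None:
        # Leave one row for a "continue?" prompt. The fallback size is used
        # when no terminal is attached.
        height = shutil.get_terminal_size(fallback=(80, 24)).lines - 1
    height = max(height, 1)
    for start in range(0, len(lines), height):
        yield lines[start:start + height]


entries = [f"{k!r} => {v!r}" for k, v in {i: i for i in range(50)}.items()]
first_page = next(pages(entries, height=10))
print("\n".join(first_page))
```

Because the generator is lazy, quitting after the first page means the later pages are never sliced or printed.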

mbauman added a commit to mbauman/julia that referenced this issue Feb 6, 2014
mbauman added a commit to mbauman/julia that referenced this issue Feb 17, 2014
mbauman added a commit to mbauman/julia that referenced this issue Mar 8, 2014
mbauman added a commit to mbauman/julia that referenced this issue Mar 8, 2014
mbauman added a commit to mbauman/julia that referenced this issue Mar 10, 2014
This is a modest attempt at improving the situation for issue JuliaLang#1759.  I've added
slightly enhanced `summary`s with information about the number of k/v pairs for
Associative and Key/ValueIterator types.

I'm still slightly confused by all the different methods used for show, but
this PR keeps the old behavior (mostly) as `showcompact`. I then added new
functionality in `show` and `showlimited`, using newlines as delimiters between
key/value pairs.  When output is limited, values are truncated at newlines or
the TTY screen edge and a limited number of pairs are printed.

The key and value iterators no longer spit out entire dictionaries when shown.
They instead show a limited {} array of the keys and values.
mbauman added a commit to mbauman/julia that referenced this issue Mar 18, 2014
mbauman added a commit to mbauman/julia that referenced this issue Jun 2, 2014
This is a modest attempt at improving the situation for issue JuliaLang#1759.  I've added
slightly enhanced `summary`s with information about the number of k/v pairs for
Associative and Key/ValueIterator types.

This PR keeps the old behavior (mostly) as `showcompact`. I then added new
functionality in `show` and `showlimited`, using newlines as delimiters between
key/value pairs.  When output is limited, values are truncated at newlines or
the TTY screen edge and a limited number of pairs are printed.

The key and value iterators no longer spit out entire dictionaries when shown.
They instead show a limited {} array of the keys and values.
mbauman added a commit to mbauman/julia that referenced this issue Jun 2, 2014
mbauman added a commit to mbauman/julia that referenced this issue Jun 2, 2014
This is an attempt at improving the situation for issue JuliaLang#1759.  I've added
slightly enhanced `summary`s with information about the number of k/v pairs for
Associative and Key/ValueIterator types.

This PR keeps the old behavior (mostly) as `showcompact`. I then added new
functionality in `show` and `showlimited` for Associative and
Key/ValueIterators, printing newlines as delimiters between key/value pairs.
When output is limited, keys are truncated to the left third of the TTY screen
and values are truncated at newlines or the screen edge, with the `=>`
separators aligned.
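
The limited layout described in the last commit message (keys cut to the left third of the screen, values cut at the first newline or the screen edge, `=>` separators aligned) can be sketched as follows. This is a Python illustration of the described behaviour, not the code from the commit; `truncate`, `show_limited`, `width`, and `max_rows` are hypothetical names.

```python
from itertools import islice


def truncate(text, limit):
    """Cut text to at most limit characters, marking the cut with an ellipsis."""
    return text if len(text) <= limit else text[: max(limit - 1, 0)] + "…"


def show_limited(d, width=80, max_rows=10):
    """Format up to max_rows pairs, one per line, with aligned `=>` separators."""
    rows = list(islice(d.items(), max_rows))
    # Keys may use at most the left third of the screen.
    keys = [truncate(repr(k), width // 3) for k, _ in rows]
    pad = max((len(k) for k in keys), default=0)
    room = width - pad - len(" => ")
    lines = []
    for key, (_, v) in zip(keys, rows):
        first_line = repr(v).split("\n", 1)[0]  # stop at the first newline
        lines.append(f"{key.ljust(pad)} => {truncate(first_line, room)}")
    if len(d) > len(rows):
        lines.append("...")  # some pairs were not shown
    return lines


print("\n".join(show_limited({"a": 1, "long_key_name": "x" * 200}, width=30)))
```

Padding every key to the width of the longest shown key is what keeps the separators in one column.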
@quinnj
Member

quinnj commented Jun 4, 2014

Closed by #5706?
