Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data load takes longer in kona than in k2.8 (mmap?) #632

Open
tavmem opened this issue Aug 30, 2022 · 6 comments
Open

Data load takes longer in kona than in k2.8 (mmap?) #632

tavmem opened this issue Aug 30, 2022 · 6 comments

Comments

@tavmem
Copy link
Collaborator

tavmem commented Aug 30, 2022

Using scrabble-puz.k from issue #615
in kona:

$ rlwrap -n ./k
kona      \ for help. \\ to exit.
  \t \l scrabble-puz.k
8263

In k2.8

$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems 
\ for help. \\ to exit.
  \t \l scrabble-puz.k
32
@tavmem
Copy link
Collaborator Author

tavmem commented Aug 30, 2022

On a related note, the second step in issue #615 takes more than 40 times longer in kona than in k2.8
In kona it took 177 milliseconds

 rlwrap -n ./k
kona      \ for help. \\ to exit.

  \l scrabble-puz.k
  #puz
8473
  "scrabblepuz" 1: puz
  \t puz2: 1: `"scrabblepuz"
177
  #puz2
8473
  ^/^/^/puz=puz2
1.0

In k2.8, it took 4 milliseconds

$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
\ for help. \\ to exit.

  \l scrabble-puz.k
  #puz
8473
  "scrabblepuz" 1: puz
  \t puz2: 1: "scrabblepuz"
4
  #puz2
8473
  ^/^/^/puz=puz2
1.0

Bakul mentioned that kona appears to be using mmap many more times than necessry.
That may be the reason for the slowness in one, or in both of these cases.

@bakul
Copy link
Contributor

bakul commented Aug 30, 2022

See #615 (comment)
You can try running each k under strace (on linux) with a much smaller list.

@tavmem
Copy link
Collaborator Author

tavmem commented Aug 30, 2022

Thanks!
Looks like you are correct about the number of mmap calls.
On the simple command "file" 1: (4 5) kona calls mmap 3 times
k2.8 calls mmap2 once

kona:

"file" 1: (4 5)
) = 1 (in [0])
newfstatat(0, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0
read(0, "\"file\" 1: (4 5)\n", 1024)    = 16
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=2017642}) = 0
mmap(NULL, 256, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8aefd81000
mmap(NULL, 128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8aefd80000
openat(AT_FDCWD, "file.K", O_RDWR|O_CREAT|O_TRUNC, 07777) = 3
ftruncate(3, 48)                        = 0
mmap(NULL, 48, PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f8aefd7f000
close(3)                                = 0
munmap(0x7f8aefd7f000, 48)              = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=2805125}) = 0
write(1, "  ", 2  )                       = 2
pselect6(1, [0], NULL, NULL, NULL, NULL

k2.8

"file" 1: (4 5)
) = 1 (in [0])
alarm(1)                                = 0
alarm(0)                                = 1
statx(0, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0620, stx_size=0, ...}) = 0
read(0, "\"file\" 1: (4 5)\n", 1024)    = 16
alarm(1)                                = 0
stat64("file.l", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0
unlink("file.l")                        = 0
openat(AT_FDCWD, "file.l", O_RDWR|O_CREAT, 0666) = 3
ftruncate(3, 24)                        = 0
mmap2(NULL, 24, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0xf7f2c000
close(3)                                = 0
msync(0xf7f2c000, 24, MS_SYNC)          = 0
munmap(0xf7f2c000, 24)                  = 0
write(2, "  ", 2  )                       = 2
alarm(0)                                = 1
pselect6(1000, [0], [], NULL, NULL, NULL

@tavmem
Copy link
Collaborator Author

tavmem commented Aug 30, 2022

For the associated command 1: "file", that follows the command "file" 1: (4 5)
kona calls mmap twice.
k2.8 calls mmap2 only once.

kona:

1:"file"
) = 1 (in [0])
read(0, "1:\"file\"\n", 1024)           = 9
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=3016329}) = 0
openat(AT_FDCWD, "file.K", O_RDWR)      = 3
newfstatat(AT_FDCWD, "file.K", {st_mode=S_IFREG|S_ISVTX|0755, st_size=48, ...}, 0) = 0
mmap(NULL, 48, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7f8aefd7f000
newfstatat(3, "", {st_mode=S_IFREG|S_ISVTX|0755, st_size=48, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 48, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7f8aefd7e000
close(3)                                = 0
munmap(0x7f8aefd7f000, 48)              = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=3776268}) = 0
write(1, "4 5\n", 44 5
)                    = 4
munmap(0x7f8aefd7e000, 48)              = 0
write(1, "  ", 2  )                       = 2
pselect6(1, [0], NULL, NULL, NULL, NULL

k2.8

1: "file"
) = 1 (in [0])
alarm(1)                                = 0
alarm(0)                                = 1
read(0, "1: \"file\"\n", 1024)          = 10
alarm(1)                                = 0
stat64("file.l", {st_mode=S_IFREG|0644, st_size=24, ...}) = 0
openat(AT_FDCWD, "file.l", O_RDONLY)    = 3
_llseek(3, 0, [24], SEEK_END)           = 0
mmap2(NULL, 24, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0xf7f2c000
close(3)                                = 0
statx(1, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0620, stx_size=0, ...}) = 0
write(1, "4 5", 34 5)                      = 3
write(1, "\n", 1
)                       = 1
munmap(0xf7f2c000, 24)                  = 0
write(2, "  ", 2  )                       = 2
alarm(0)                                = 1
pselect6(1000, [0], [], NULL, NULL, NULL

@tavmem
Copy link
Collaborator Author

tavmem commented Sep 2, 2022

I just realized that this issue replicates the problem identified in issue #626
This issue does identify a 3rd culprit for the slowness (too many calls to mmap).
However, I'm closing this issue, since it is redundant.

@tavmem tavmem closed this as completed Sep 2, 2022
@tavmem tavmem mentioned this issue Sep 2, 2022
@tavmem tavmem reopened this Sep 6, 2022
@tavmem tavmem changed the title Data load takes 250 times longer in kona than in k2.8 Data load takes longer in kona than in k2.8 (mmap?) Sep 6, 2022
@tavmem
Copy link
Collaborator Author

tavmem commented Sep 6, 2022

As a reminder (see issue #626):
after fixing the placement of syntaxChk the kona result is

$ rlwrap -n ./k
kona      \ for help. \\ to exit.
  \t \l scrabble-puz.k
175

The result with k2.8

$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems 
\ for help. \\ to exit.
  \t \l scrabble-puz.k
26

The most likely explanation is that kona uses mmap more than it needs to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants