Chat:World/2021-05-27

From CG community
Revision as of 12:07, 15 June 2021 by Chat Log (talk | contribs)

Chainman: Never give up

dscientist: Never surrender, never look back

Chainman: never gonna give you up

Alemhan: never gonna let you down

ASW: CoC is basically just AP computer science tests lol

dscientist: ok, but what is it? I am in 2 rooms CoC+aLongNumber.

sscoot13: im just learning programming and when i do these clash of codes im just amazed at how compact other peoples code is compared to me lol

bg117: lol same

bg117: literally theirs is one-liner

KiwiTae: sscoot13 patpat

sscoot13: patpat?

walleta: b

walleta: bb

bg117: but i still get 100% for some reason

sscoot13: haha same lol

MatasSiu1E: Ching Chong

bg117: stfu racist

ArtLiteracyStory: yo this is such a good way to learn code

ArtLiteracyStory: So fun

bg117: ya i agree

jrke: i have an issue the vector size in c++ gets unexpectedly very big causing timeouts to me

jrke: why does this happen

ddreams: perhaps you're overwriting it elsewhere

KiwiTae: got a leak somewhere ah~

nicola: :leak:

jrke: are submissions not getting updated?

ddreams: ?

nicola: Which one?

ddreams: is there a problem with the site?

ddreams: some vague problem, somewhere

ddreams: pls help

jrke: for OOC all my battles completed but my rank didn't change at any moment

Hebo: is the site borked or is it just me

nicola: Oops...

jrke: thibaud said in discord that they are downgrading the database after the challenge so there can be 2 interruptions in service

jrke: "we're downgrading the DB after the challenge. You can expect 2 short interruptions of service"

Hebo: let the bass cannon kick it

ArtLiteracyStory: hello world

MXgms: hi

LinhT.Nguyen: lo world

Peached: brb back to the website

Peached: after like a year

Peached: gonna update my profile

Cybersick: Did you see the code?

Peached: why is the website failing in everything

jrke: they are downgrading the DB so there's an interruption in service for now

Peached: for how long

jrke: not sure can be done in a min or hour or two hour

Peached: ok

jrke: i don't know that

Peached: imma go get dinner then

jrke: :thumbsup:

AqilM: hallo

AqilM: wie geht's Ihnen?

AntiSquid: #de

YahyaBahjaoui: hy

YahyaBahjaoui: hey

Westicles: Now the default for code ala mode made it to brone :thinking:

Westicles: bronze

Peached: im back is it done

jacek: yeah, "WAIT" is sufficient for bronzee

zakaria-wina: Hi !

Peached: is the db thing done

b0ba5: knock knock

Peached: come in

Peached: lawl

Westicles: Oh, I see. Took it 3 months to get there though

jacek: wait 6 months and youll get silver

Westicles: I won't hold my breath

AntiSquid: you wouldn't last for more than a few minutes anyway

Alshock: either that or he'd hold it for a very long time

KalamariKing: are you assuming W​esticles is mortal

AntiSquid: unless it's a chatbot, it's mortal

Westicles: K​alamariKing , nice trick

Westicles: Got in trouble for too much pinging?

AntiSquid: "degenerate furry" :thinking:

jacek:

jacek: :thinking:

Greg_3141: why wont code a la mode load

KalamariKing: anti lol

KalamariKing: Westicles there's a zero-width space, didnt wanna ping

KalamariKing: West​icles

AntiSquid: you can turn off the ping notification westi cles , or you prefer breaded kalamari rings ?

AntiSquid: KalamariRings ping

jetenergy: hey, i have a small question concerning "Code your own Automaton". I think im having trouble understanding the question regarding the most powerful word, does anybody know?

KalamariKing: wdym?

jetenergy: in the description they are talking about "choosing the most powerful word" but i have no idea what they mean by powerful, do they mean the word with the most children or the longest word

Uljahn: do you mean "the most likely"?

Westicles: most occurrences, then alphabetical

jetenergy: aha thank you westicles
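Westicles' rule (most occurrences, then alphabetical tiebreak) can be sketched in Python; the function name is just for illustration, not the puzzle's reference implementation:

```python
from collections import Counter

def most_powerful_word(words):
    """Pick the word with the most occurrences; break ties alphabetically.

    A sketch of the rule described above, not the puzzle's actual code.
    """
    counts = Counter(words)
    # min() over (-count, word) picks the highest count first,
    # then the alphabetically smallest word among ties.
    return min(counts, key=lambda w: (-counts[w], w))
```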

MSmits: hey guys, I think I'm done with my TTT experimentation. Did a bunch of tests with keras and got some nice results

MSmits: https://imgur.com/a/FjaX0Y4

MSmits: https://imgur.com/a/1tUItJF

BlaiseEbuth: There's points. And curves. Blue and orange.

MSmits: indeed

jacek: lost games?

MSmits: second one is games played vs random player. Both the NN and the random player take any immediate winning moves automatically.

MSmits: if no winning moves available, the random player randoms and NN uses network

AntiSquid: so what next ?

reCurse: Half of machine learning is about making pretty graphs, so good job

MSmits: thanks :)

AntiSquid: EDA is more than that @_@

MSmits: you can see it overfitting at some point, but i think there's a limit to this as you said reCurse

MSmits: I think generalization is not possible with so little data

MSmits: the bot is trained on 90% of states and the other 10% are validation
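The 90/10 split MSmits describes (he later mentions seeding the RNG with 0) might look like this sketch; `states` stands for whatever encoding of TTT positions is used:

```python
import random

def split_states(states, train_frac=0.9, seed=0):
    """Shuffle all enumerated states with a fixed seed, then cut them
    into training and validation sets. A sketch of the 90/10 split
    described above; the state encoding is left to the caller."""
    rng = random.Random(seed)
    shuffled = states[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```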

jacek: so little data... 100%?

MSmits: but the validation data is just not similar enough to training data

MSmits: because there is so little of it

MSmits: also because it's TTT, you expect it to play perfectly

MSmits: even though you'd never do that in a real game

MSmits: so the bar is too high

AntiSquid: are you sure you need validation data for this ?

MSmits: well if you're practicing it makes sense

MSmits: but if you're just looking for a good bot, then no

MSmits: because all possible data is available

MSmits: so generalization is pointless

reCurse: It's always a good idea to have some sort of clue about generalization yes

MSmits: yeah, for practicing machine learning this was a good idea

MSmits: it was quite obvious that with a network too big, it overfits. Nice learning experience

jacek: nice machine learning experience eh

MSmits: it didnt do that much better than my handcoded one. Adam optimizer was a big boost

AntiSquid: how many hidden layers in the ends ? it just says hidden nodes

reCurse: Probably because your SGD wasn't well tuned enough

MSmits: mean squared error over absolute was also a bit better

MSmits: yeah probably reCurse

MSmits: AntiSquid 1
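A minimal Keras setup matching what MSmits describes (one hidden layer, Adam, mean squared error) could look like the following sketch; the input encoding (9 board cells), hidden width of 20, and tanh value head are assumptions, not his actual code:

```python
import tensorflow as tf

# Sketch only: one hidden layer, Adam optimizer, MSE loss, as discussed.
# Input shape (9 cells) and layer sizes are illustrative assumptions.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(9,)),
    tf.keras.layers.Dense(20, activation="relu"),   # the single hidden layer
    tf.keras.layers.Dense(1, activation="tanh"),    # value output in [-1, 1]
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="mse")
```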

MSmits: i could get better results sometimes when i added another layer

MSmits: but it's kinda hard to make graphs like these if the number of layers is also a variable

AntiSquid: you add color or shape as 3rd dimension

MSmits: sure, yeah and i would, if TTT was my goal

MSmits: oh

MSmits: i ran into an issue

AntiSquid: for graphs

MSmits: playing games with this network was extremely slow

MSmits: slower than with my handcoded one

reCurse: Python is slow? :o

MSmits: they are both python

AntiSquid: too many nodes

reCurse: Ah

MSmits: network was over 10x slower with Keras

MSmits: even if i had 1 hidden node

MSmits: instead of 20

reCurse: Yeah there's some overhead to API calls

reCurse: Especially if you use GPU

MSmits: no gpu. I think there is a problem with conversion of input to tensor

MSmits: and it does 30 predictions per game

MSmits: 1000 games means 30k predictions
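The slowness MSmits sees is consistent with per-call overhead in `model.predict`, which sets up a whole batched inference loop on every call. For one position at a time, converting the input to a tensor yourself and calling the model directly is usually much cheaper; `predict_fast` and the board encoding here are illustrative assumptions:

```python
import numpy as np
import tensorflow as tf

def predict_fast(model, board):
    """Evaluate a single position without model.predict's setup overhead.

    `model` is assumed to be a small Keras model taking a flat board
    vector; calling the model like a function skips predict()'s loop.
    """
    x = tf.convert_to_tensor(np.asarray(board, dtype=np.float32)[None, :])
    return model(x, training=False).numpy()[0]
```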

jacek: so whats next step? RL?

reCurse: Maybe, python is a nightmare to profile in my experience

MSmits: jacek not sure, I was thinking maybe experimenting with oware supervised learning before trying it solely by selfplay

MSmits: that will require me to have an inferer so thats a lot of work already

MSmits: (in c++ i mean)

MSmits: i have a lot of data from meta mcts, so supervised should be workable

jacek: alright. i started with SL somewhat too

MSmits: expect this to take a month or so though, dont ask me every day :P

MSmits: TTT was easy

AntiSquid: is it ready yet ?

MSmits: lol

reCurse: Automaton2000, please learn to ping MSmits about it every day

Automaton2000: here is the full code

reCurse: Thanks?

MSmits: oh right jacek you taught your NN to predict what your other bot found with a search

MSmits: I remember that

ikustom: bruuh

AntiSquid: 20 blocked calls from the same number in just 1 minute, new record :o

jacek: your mum?

VizGhar: oO

jrke: 3secs per call

jrke: quite fast

Zimtime907: Hi Guys!

Zimtime907: :grinning:

AntiSquid: jacek being a cunt again

jacek: Oo

Zimtime907: Wait what?!:confused:

jacek: typical mod's language

Zimtime907: Ahhhh...

reCurse: Hey no generalization that's just him

Zimtime907: Okay...

Zimtime907: Sure

Zimtime907: Maybe, maybe not

BlaiseEbuth: Yeah don't worry, AntiSquid is a cunt...

reCurse: Ok stop with the unnecessary language

BlaiseEbuth: :speak_no_evil:

reCurse: Some mods we got

BlaiseEbuth: My apologies your Eminence. I didn't want to hurt your chaste ears. :bow:

reCurse: :roll_eyes:

Greg_3141: heard that I could get points in code a la mode just by clicking the submit button

AntiSquid: what the fuck is your problem BlaiseEbuth? dipshit

AntiSquid: jacek no need to be an ass and bring my mum into the convo, your trolling goes too far

reCurse: Geez you need to take a chill pill

BlaiseEbuth: Shhh ! Be kind or reCurse will scold you.

reCurse: I mean come on we ban people on a daily basis for stuff like that

reCurse: Can you please?

Scarfield: +1

AntiSquid: not my fault

reCurse: Right

AntiSquid: stuff like that lol

BlaiseEbuth: I never banned anybody for saying c***t... This chan is so touchy... :unamused:

AntiSquid: deserved

reCurse: Real kids...

AntiSquid: you too now?

AntiSquid: really?

AntiSquid: just let it go

Greg_3141: if you get banned, do you get banned from the chat or the website?

KalamariKing: the mods are arguing about how touchy the channel is... maybe the channel mirrors the mods

jOE: Hi!

AntiSquid: i thought to not kick him for that ... but whatever guess i should have

KalamariKing: Hello

AntiSquid: good point KalamariKing

reCurse: You're right, it's certainly not the kind of channel I want to stick around.

BlaiseEbuth: Probably. That's why I'm not so often here...

AntiSquid: bye

AntiSquid: if you have a problem take it on with guilty stop the drama

reCurse: If you don't see the irony in flying off the handle at a mom joke and calling people c**ts and pretending to be a moderator, I'm not gonna bother :)

AntiSquid: talking from a high horse again

ZarthaxX: peace in chat please :(

AntiSquid: ya better don't i prefer when you don'rt

ZarthaxX: also hi

BlaiseEbuth: :popcorn:

KalamariKing: howdy

ZarthaxX: gucci

MSmits: hi ZarthaxX

ZarthaxX: you kalamari

ZarthaxX: hi smito :O

Scarfield: Hi Zarthie

ZarthaxX: SCARFO!

AntiSquid: you could just not get involved reCurse, nobody forced you too

AntiSquid: simple

KalamariKing: mods fighting + horses = jousting?

reCurse: I'm doing my job actually

AntiSquid: then don't take it on with me

reCurse: You're not doing yours, so I will

ZarthaxX: how is that nn smito

AntiSquid: so he's kicking mud, but good you blame me

AntiSquid: typical

ZarthaxX: scarfo you have been missing dont you? mhm

KalamariKing: take this out like big bois, get out your jousting large-toothpick-things

MSmits: ZarthaxX i am done with TTT. I'll pm you two graphs

AntiSquid: and good you added your own ad hominid and still pretend you did the right thing, keep adding insult to the injury reCurse

Scarfield: havent been on much the last few weeks true, but not missing entirely :)

BlaiseEbuth: Hold on. I gonna grab more popcorn.

Scarfield: tbh its not reBless who is adding insult to injury. but i would appreciate it if this "argument" was over

KalamariKing: alr now I'm being serious why are you two just throwing insults at each other? either agree to disagree, or just drop it

reCurse: Can you point me to the part where I insulted him?

AntiSquid: trolling ?

ZarthaxX: Scarfield ah ok, missed you mah man

KalamariKing: "you're not doing yours, so I will"

AntiSquid: can't see the problem from your own ego i guess, reCurse

reCurse: Geez looks like you had a beef with me just waiting for an occasion

KalamariKing: anti over here is just popping off

AntiSquid: not at all

Scarfield: you too Zarthoo :kiss:

KalamariKing: reCurse you're not doing much but both of you chill

AntiSquid: i am just saying it as is, thanks for proving my point

QAT: you guys are weird

AntiSquid: thanks QAT, love you too

KalamariKing: chat's not normally so toxic, they usually kick arguments but up on their high horses, the rules are different apparently

reCurse: Sorry if asking to act decently is 'being on a high horse'

reCurse: I'll leave

Greg_3141: incredible, my lazy chef has been promoted to wood 1

ZarthaxX: Scarfield :hug:

QAT: just take it a discord voice channel and turn the gain on your mics like real men

ZarthaxX: :hugging:

KalamariKing: you don't have to leave, you all should just stop arguing (specifically anti... just shut)

AntiSquid: no it's the way you ask and the timing

AntiSquid: bad timing, missing part of the problem entirely, ya the selective filtering and then the bullshit you throw at me for now reason, pretending to do your job haha

AntiSquid: "self-righteous" perfect

KalamariKing: my guy

KalamariKing: youre beating a dead horse

AntiSquid: he's not dead yet

KalamariKing: youre missing the point

Scarfield: https://imgur.com/a/FjaX0Y4 https://imgur.com/a/1tUItJF ZarthaxX smito shared these earlier about his TTT NN :)

ZarthaxX: he pmed me those just now yes

ZarthaxX: sweet graphs :)

KalamariKing: it might be an american expression AntiSquid but it means drop the argument, he's already gone

JSboss: guys whats your favorite coding soundtrack. i always bump Miles Davis

AntiSquid: i know what it means, was joking KalamariKing @_@

QAT: @jsboss david shawty

Wontonimo: oh, those are nice graphs!

MSmits: hey Wontonimo

MSmits: yeah happy with them. They seem very explainable/unsurprising to someone who knows ML right?

MSmits: or do you see something weird?

KalamariKing: they look pretty good to me

MSmits: it's ridiculously easy to use Keras/tf for this

MSmits: it took me less than a day to convert to that

MSmits: you just have to look up the api calls and such when you try different things. And of course there's the googling what it all means

KalamariKing: see?

KalamariKing: tf makes things easy, plus its all optimized cpp so its much faster

MSmits: the training is

KalamariKing: are you still using numpy datasets tho?

MSmits: doing a single prediction was extremely slow

MSmits: over 10x slower than my handcoded network

KalamariKing: that might be a bit broken...

MSmits: I was using numpy yes

KalamariKing: switch to tf.data

MSmits: probably the conversion from numpy to tensor was slowing me down

MSmits: my handcoded network didnt need to do that

KalamariKing: yeah, it has to do all that behind the scenes

KalamariKing: once it's cached as a tensor (e.x. training) its much faster

MSmits: how do you use tf.data?

KalamariKing: Thats a good question

MSmits: is this usable in place of numpy arrays?

KalamariKing: I believe so?

Wontonimo: those graphs look great MSmits. Nothing out of the ordinary, and furthermore they show that everything appears "normal" and good.

MSmits: well.. in the end it doesnt matter that much as i will be using a c++ inferer, but it was a bit annoying player 1k TTT games, doing 30k prediction and it taking over 10 mintues

MSmits: i type crap today

MSmits: good to hear Wontonimo

KalamariKing: predictions aren't supposed to take 20 seconds hmm

MSmits: eh no not 20 seconds

MSmits: 30k predictions in 10 minutes i said :0

MSmits: thats 50/second

KalamariKing: I can't do math today

KalamariKing: That sounds better

MSmits: still weird, it didnt matter if i had 1 hidden neuron or 20 either

Wontonimo: in particular, something that is VERY nice to see is that it gets better with more size. That means there is something regularizing your network. Perhaps now that you are using Adam.

MSmits: yeah adam helped a lot and mean square error

MSmits: that second one helped a bit

Wontonimo: also nice to see is that Validation didn't become worse with more size, again something is regularizing your network in a good way.

MSmits: it might be a bit different from your average classification nn though, because I train on all possible states

MSmits: well actually 90% of them

MSmits: 10% are validation

Wontonimo: a similar graph, but instead of size for the X axis, pick one size and replace the X axis with epochs. If the graph looks similar and has those same nice features then you've got a great setup

MSmits: hmm, i will not play games then, will just do the training. That will be fast

MSmits: size 25 seems to be good

Wontonimo: in particular, Validation following test and not exploding near the end. Next test continuously decreasing and not exploding at the end.

MSmits: mmh i should automate this

MSmits: can keras produce a graph of some kind?

MSmits: with an api call?

KalamariKing: kinda

KalamariKing: gimme a sec, switching classes, then I'll tell you what I did

Wontonimo: there is the whole tf thing for monitoring training

Wontonimo: tensorboard

KalamariKing: yeah that

KalamariKing: had to find it again, sorry

MSmits: ah there is also just matplotlif

KalamariKing: tensorboard's pretty ok

MSmits: lib

Wontonimo: here is a tensorboard playground link for later https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/docs/tensorboard_in_notebooks.ipynb

Wontonimo: probably matplotlib will be your quick and easy friend for now
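A quick matplotlib sketch of this, plotting the loss curves from the dict that `model.fit` returns in `history.history`; the function and file names are illustrative:

```python
import matplotlib
matplotlib.use("Agg")            # headless backend; drop this in a notebook
import matplotlib.pyplot as plt

def plot_history(history_dict, out_path="loss.png"):
    """Plot training vs validation loss per epoch and save to a file.

    `history_dict` is the plain dict found in history.history after
    model.fit(..., validation_data=...) finishes.
    """
    plt.figure()
    plt.plot(history_dict["loss"], label="training loss")
    plt.plot(history_dict["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.savefig(out_path)
```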

Wontonimo: tensorboard allows you to explore your network values, the graph, tensors, histograms of embeddings etc

MSmits: yeah i think i have it working, just waiting for training to finish

Wontonimo: it takes a bit to get tensorboard going and figure out the controls and it isn't particularly good for automation, but if you do a lot of NN work it is eventually worth the effort to get it going.

KalamariKing: really? I thought tb was quite easy

Voudrais: Oups An error occurred (#412): "Could not get leaderboard in time: giving up...". I have this problem, please help me

Wontonimo: i'm dumb

MSmits: mmh fixing bugs with model history

KalamariKing: Voudrais is your connection stable? firewall blocking anything?

Wontonimo: KalamariKing or maybe tb has improved in the meantime. I just remember the first time using it was painful.

Voudrais: I think yes, this error only happens when I'm clicking "Compete" in menu

KalamariKing: afaik it was just add tb in the callbacks, then call tensorboard --logdir logs in shell
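KalamariKing's recipe as a sketch: attach the `TensorBoard` callback during `fit`, then run `tensorboard --logdir logs` in a shell; the log directory name is an assumption:

```python
import tensorflow as tf

# Attach this callback to fit(); afterwards, `tensorboard --logdir logs`
# in a shell serves the training curves in a browser.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs")
# model.fit(x_train, y_train, epochs=10, callbacks=[tb])
```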

Voudrais: I can't see "Games" section, "Contest" and "Leaderboards" work normally

BlaiseEbuth: Hmm... Same

BlaiseEbuth: There was DB downgrade operations today, perhaps that's not over.

KalamariKing: They were doing db stuff a few hours ago I think

Wontonimo: the trouble i ran into was getting tensorboard to actually display anything

KalamariKing: That shell command was what displayed it

KalamariKing: I can send you a notebook if you want

MSmits: https://imgur.com/a/LwwPmOO

MSmits: this is with batchsize 10 instead of 1, i was impatient

KalamariKing: that's pretty good MSmits

MSmits: yeah I think so

KalamariKing: it learns pretty fast and there isn't much overfitting

Wontonimo: bigger batch size usually results in a network that is more generalized, depending on how your error is calculated.

Wontonimo: Yeah, the graph looks fantastic

Wontonimo: very very well behaved network which avoids the two pitfalls i was talking about earlier

MSmits: and I have learned a way to graph stuff now too! matplotlib is nice!

MSmits: tensorboard will come later i suppose, for now this'll do

KalamariKing: https://colab.research.google.com/drive/1j8hWE8WMDyo7x8lobGESQ_y9_NbHadEg?authuser=1#scrollTo=KBHp6M_zgjp4

KalamariKing: if you want it, thats a basic mnist nn, but it uses tb

MSmits: no access

Wontonimo: There is a noise component in the Validation. You can think of this as learning and unlearning lessons from being overtrained. A lower learning rate and larger batch size may smooth that out.

KalamariKing: oh forgot about that lol

MSmits: hmm ok

MSmits: will try those two things Wontonimo

KalamariKing: https://colab.research.google.com/drive/1j8hWE8WMDyo7x8lobGESQ_y9_NbHadEg?usp=sharing

KalamariKing: there

Wontonimo: you mentioned that your learning rate is "already really low". Learning rates in Adam seem really low, but values as low as 1e-8 can still work

UncertainLeo: lua cool

UncertainLeo: any other language cooln't

MSmits: hmm no it's not that low currently. It's 0.001

MSmits: looks good KalamariKing

MSmits: batchsize from 10 to 20 was worse, trying learning rate 10x smaller now

Wontonimo: Andrew Ng's advice on selecting a learning rate is to just try some x3 and /3 values. What you want is something that takes a couple epochs to achieve good results, not too fast, not too slow.

MSmits: ahh, not sure what a couple is though

Wontonimo: 2-3

Wontonimo: not great results, but meaningful progress.

MSmits: I was using 400 before.

MSmits: now 1000 because of low learning rate

MSmits: I dont have that much data, maybe thats why i need more epochs

Wontonimo: but in about 2-3 epochs you can see that it is doing well already, even in the previous graph you sent!

Wontonimo: that's all i'm talking about

MSmits: ohh ok

MSmits: I thought you meant 2-3 to finish training

AqilM: hallo

MSmits: https://imgur.com/a/N2PLCyb

MSmits: with lr 0.0001

Wontonimo: a good sign of a learning rate that is too high is that the error rate decreases and everything looks good, then suddenly out at epoch 20 or something the error explodes and never goes back down, or wildly flops up and down

MSmits: everything else the same

MSmits: ah then it was ok before and still ok i guess

MSmits: I am guessing a wide range of lr's will work

MSmits: this one is a bit too low probably, because it needs too much time to finish

Wontonimo: yes, a wide range will def work. This graph looks suspiciously like the previous but better behaved.

Wontonimo: oh, i didn't notice that

Wontonimo: wow, yeah, that's long.

MSmits: thats because i made the LR 10x smaller :)

Wontonimo: if you are looking for another experiment, may i suggest your original LearningRate and a batch size of 100 or 64

MSmits: hmm ok

MSmits: 64 then

colindillman: what the heck is c++ i cant understand it

MSmits: https://imgur.com/a/NWwKK8u

MSmits: it seems really good, but i am a little worried about the loss

MSmits: it is much larger

Wontonimo: wow. interesting.

MSmits: this could be a feature of the fact that my "sample" is the entire state space maybe ?

MSmits: batching might be bad in that case?

RoboStac: are you training on the same data in the same order each time?

MSmits: yes

MSmits: i set random seed to 0 before taking 90% as training

RoboStac: might be worth shuffling it between training steps

Wontonimo: Not the same order unless you set shuffle=False in tf.model.fit

MSmits: shuffle is set to true

RoboStac: ah, ok

MSmits: because that's the default

RoboStac: nm then

Wontonimo: so, something strange is going on here

MSmits: well I will do a 1k game test, maybe it doesn't play worse

Wontonimo: your batch size is 64 and LR is 0.001

MSmits: yes

MSmits: and i have 4520 states as data, 90% training, 10% validation

reCurse: Increasing batch size also typically means increasing learning rate

Wontonimo: ^^ right

MSmits: you mean, you have to increase learning rate for it to work?

reCurse: Not necessary to work

MSmits: no i mean to be as effective

reCurse: But your gradient is more precise so you can take a larger step

MSmits: ahh

reCurse: It's more effective wrt time

reCurse: Because 2x batch takes 2x time

reCurse: But gradient is more accurate

MSmits: maybe it just wasnt done training yet because the lr was too small

reCurse: So you can 2x LR
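The linear-scaling heuristic reCurse describes (double the batch, double the learning rate) as a one-liner; it is a common rule of thumb, not a guarantee:

```python
def scaled_lr(base_lr, base_batch, new_batch):
    """Linear scaling heuristic: a larger batch gives a more accurate
    gradient, so the learning rate can grow by the same factor."""
    return base_lr * new_batch / base_batch
```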

JakubBiskup: is it just me or multiplayer still not working? :(

reCurse: I have a picture in my mind of the article/paper I remember it from, but finding it back is always such a trouble

MSmits: yeah that solved it reCurse

Wontonimo: awesome

reCurse: It was showing a trajectory

reCurse: With more noise with smaller batch size

reCurse: It made a lot of sense, anyway

reCurse: The funny part is you actually *want* that noise in RL

reCurse: But not in SL

MSmits: to get out of local minimum

MSmits: ?

reCurse: Ok sorry, trying again

reCurse: You want more noise in RL

reCurse: You want less in SL

MSmits: I see

reCurse: Because you don't know if your data is good in RL yet

reCurse: So you want to keep some noise for exploration

reCurse: SL you're assumed to have the perfect data

MSmits: ah yeah

reCurse: 'Stationary target' is the term

MSmits: https://imgur.com/a/r4vgHYY

MSmits: this is with batch 64 and 0.01 LR

MSmits: the end result is similar to what i had with batch 1 and 0.001 LR, maybe a bit less noise

MSmits: trains much faster though :)

Wontonimo: and it seems like validation is a little high? or am I just squinting wrong?

MSmits: well validation is possibly closer to training loss

MSmits: but the absolute validation is about as good as before

MSmits: relatively it seems better because training loss is higher

MSmits: anyways, i should move onto other things. I might mess around with oware for a few days. Not coding NN, but coming up with a way to create a good supervised learning set

Wontonimo: i've mentioned before how I like when training and validation are close. It isn't a universal belief, and there are definitely those in the camp that believe that test should always be able to achieve 0% error.

MSmits: nah it makes sense that they should be close

Wontonimo: sorry, training not test

MSmits: but in the end you just want your validation loss as low as possible

Wontonimo: but you get it

Wontonimo: yeah

MSmits: the weird thing though, is that in the case of my TTT NN, that is not what makes it play better

MSmits: since the 10% validation set is just that... 10%. the 90% training set is most of the gameplaying :)

MSmits: i could just not validate at all and train on 100% :) But that would be boring

reCurse: That's part of why I was saying it's not a good testbed

MSmits: yes i got that

Wontonimo: pro tip : you have to be careful with trying to chase validation. if you keep tweaking and tweaking your design and hyperparameters to minimize validation, then you are effectively overfitting for that metric and you have all the same pitfalls of overfitting again

MSmits: but as you see, generalization is still possible. It is just meaningless in terms of this usecase

KalamariKing: WOO $10 says I failed that math test

MSmits: well maybe you shouldn't be on CG between tests :P

Wontonimo: lol

reCurse: I find it hard to tell if it's generalization or something else on such a small state space

KalamariKing: wontonimo wdym chasing validation?

MSmits: yeah, well... me too, but at least it gets better at predicting a validation set while training on a different one

KalamariKing: are you saying you're essentially overfitting to the validation data?

Wontonimo: yes

KalamariKing: if its not training on that data tho, how can it overfit?

reCurse: There can be hidden correlations in training and validation

Wontonimo: if you change a hyperparameter to get a better validation, you have manually-fit

reCurse: That's part of the challenge

Wontonimo: if you automate a process to find the best hyperparameter to minimize validation, you are by definition overfitting to validation

KalamariKing: interesting

KalamariKing: what if you change the validation data per batch tho?

Wontonimo: no, don't do that.

KalamariKing: why?

Wontonimo: unless you have new data

MSmits: btw, reCurse I agree that the result would have been far more meaningful if it were learning to distinguish pants from shirts on mnist data, but I get more motivated by boardgames :)

reCurse: Well it would have been easier to test the basics

reCurse: I get the motivation part

MSmits: agreed

KalamariKing: Wontonimo why is changing train/val split per batch a bad idea?

Wontonimo: i think there are some better resources online about stats and train/test/validation split rationale than what I could say in just a few short posts here, but in a nutshell, a test should be a test. If you study to the test, it isn't testing if you understand anymore, it is a measure of whether you memorized
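The train/validation/test discipline Wontonimo describes can be sketched as a three-way split; the fractions and seed here are illustrative:

```python
import random

def three_way_split(data, val_frac=0.1, test_frac=0.1, seed=0):
    """Train for fitting, validation for tuning hyperparameters, and a
    held-out test set you look at only once, at the very end."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]                  # touch only once, at the end
    val = shuffled[n_test:n_test + n_val]     # for tuning along the way
    train = shuffled[n_test + n_val:]
    return train, val, test
```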

reCurse: I couldn't quite wrap my head around the actual usefulness of test/validation

reCurse: And it seems to be uncommon practice

reCurse: I get the idea you can overfit the hyperparams but does it really matter if you have a good dataset to begin with?

KalamariKing: well for one you can detect overfitting. other than that, what's it good for? maybe knowing when it's ready for application?

reCurse: Sorry, test vs validation

reCurse: Not test vs training, I totally get that

KalamariKing: validation is testing per epoch right

struct: Is it used for training?

KalamariKing: and/or a similar concept

reCurse: It doesn't help I often see the terms test and validation used interchangeably

Wontonimo: yeah, they are interchanged a lot and that doesn't help

jacek: so your data must be... invalid

jacek: or intest

Wontonimo: but imagine you've been training an NN, and using 1 test set to see how well it has been doing. During this process you

Wontonimo: change a bunch of configurations, add layers, put in some features

Wontonimo: and keep working till your test results are awesome

Wontonimo: how do you know if you haven't just overfit the test?

reCurse: But that's the part I don't get

jacek: or switch to RL. infinite data *_*

reCurse: The minute you use a validation set, you're subject to the same overfitting

reCurse: So what's the point? What am I missing?

Wontonimo: you need a 2nd test set that you've NEVER used before. You use it only once. That's it

reCurse: So what's the point?

jacek: you use test set after every epoch and validation after all learning ?

Wontonimo: that's the point

reCurse: I don't get why you would do that

reCurse: Seems like a waste of data

Wontonimo: the 2nd one is the FINAL test, because in actuality all the previous was self-testing

reCurse: Ok but by definition you can't compare it to anything else

reCurse: So what use is it for?

Wontonimo: to see if you f*** up and overfit to your 1st test set.

struct: But won't it still overfit to the first test?

Wontonimo: which happens

Wontonimo: A LOT

reCurse: But you have no reference by definition

Wontonimo: what do you mean?

reCurse: So what do you do then? Try to fix it, but you can't compare to the previous result

reCurse: Because if you do you're back to square one

King_Coda: XD

King_Coda: Disney XD

Wontonimo: then you need another test set

reCurse: Who says it's even comparable then

Wontonimo: unseen

ldiaks01: hey

KalamariKing: Hello again King_Coda long time no see

Wontonimo: that reCurse is a very good question, and something that plagues deep learning

reCurse: And what if your dataset is fixed, you keep a validation set and hope you don't screw up?

King_Coda: KalamariKing!

reCurse: I'm not convinced at all on the concept

LazyMammal: reCurse my 2 cents. the once-use-only validation set is to prevent the researcher/operator/engineer from fooling themselves. however much overfit, accuracy or precision the NN has on the dataset, the human wants to know how much is real. the validation set is that reality check. super important for research papers. still kinda important for setting up pipelines at home. don't want to live in delusion right?

Wontonimo: excellently said

King_Coda: Did LazyMammal just watch Inception or something?

LazyMammal: :D

Scarfield: at least 4 cents :p

Default avatar.png MarcFrancisEscasinas: what does int mean

reCurse: Yeah, so you screwed up, so then you have no data left to try again

King_Coda: Integer

King_Coda: Int - eger

Default avatar.png MarcFrancisEscasinas: tnx

King_Coda: :stuck_out_tongue:

Wontonimo: and if you didn't do that, then you wouldn't know if you screwed up

KalamariKing: As far as I have gathered... You use the train data to... train, and to try to get the validation data as high as possible (without training on it). Then you use the test data to make sure you didn't overfit to the val data

KalamariKing: right?

reCurse: I'd think your test set was bad to begin with if that happens

King_Coda: KalamariKing is officially smarter than me

Scarfield: not that i know, but it sounds like its just insurance against a bad test set

King_Coda: I didn't understand a single word of that

LazyMammal: confidence interval and sampling sizes still apply. the validation set should be big enough to ... validate things but no bigger. of course using as much data for train/test as possible. just reserve a tiny bit for illusion bubble popping

reCurse: And who's to say validation data isn't ill-formed either...

AntiSquid: it's true even your validation might not be enough, depends how much data you're missing . how do you decide how much data you're missing unless someone specifically tells you ?

LazyMammal: yep, if your data collection is skewed in some way and validation is drawn from that pool of data then it won't improve your data. ofc

jacek: reCurse this should be better than nothing

jacek: this way you can discard anything because you dont have ground truth

reCurse: Yeah but the minute you use it you're screwed

reCurse: You can't do anything anymore until you find a new one

reCurse: That's insane

jacek: schroedingers data

KalamariKing: I was literally asking if that explanation was correct

KalamariKing: By the way the conversation continued I assume it was right?

Wontonimo: it's not so black and white, do or die reCurse.

reCurse: But that's the distinction between validation and test

LazyMammal: "can't do anything" is limited to things you can say "this has been validated". if you don't care about that answer you never have to ask it.

reCurse: I don't know, I naively think most hyperparameters would have a hard time overfitting to data it never sees

reCurse: Then again I don't work in life-critical domains so

KalamariKing: *sigh* I'm going with correct

Scarfield: xD

Wontonimo: imagine i'm coming up with a theory of motion for heavenly bodies. I have data for 20 planets and moons. I come up with something. Susan says she has data for 5 more that we could test by theory. It doesn't match so i keep tweaking things till it matches. Is it a good theory now? Will Bob has 5 more, and we look at that and it doesn't match. What does that say of my original theory and the tweaks

KalamariKing: Isn't that the point of splitting val and test? so that even if it overfits to val, it still has new data it hasn't seen?

reCurse: Yes but I take the issue with wasting data that you can't even compare or reference

KalamariKing: ok, true, but with a large enough dataset that shouldn't matter

KalamariKing: but with a large enough dataset overfitting *shouldn't* happen

struct: what if it does

Wontonimo: tbh reCurse, I do all this to validate my model, then after it is validated I retrain with ALL THE DATA

struct: how will the hidden dataset help?

AntiSquid: reminds me of particle physics Wontonimo . always a new particle gets added and then the model needs to be rethought, what are your suggestions ?

Wontonimo: so i don't think of it as a waste of data at all.

LazyMammal: http://chat.codingame.com/pastebin/4ee7b1a9-1eb8-4480-a2db-c36d848b831d

KalamariKing: if the patterns are similar enough, can the model overfit to the train data, but also be overfit to the test data it has yet to see? or is that the definition of a good model

LazyMammal: oh, I hit some kind of word limit?

KalamariKing: I think so

KalamariKing: but if you have real-world data like the bot battles, why not go with rl

reCurse: Oh that's an interesting point LazyMammal

reCurse: If you do have a final deadline then sure

reCurse: I can see

Wontonimo: wow, i didn't think this was going to lead to so much discussion on the topic! :D

Wontonimo: cool stuff

reCurse: The only reason I still stick around hehe

Wontonimo: i though it was my dad jokes

Wontonimo: you need a mentor

reCurse: I do? I thought I made enough bad jokes already

jacek:

Wontonimo: AntiSquid, about the particle physicics, it's like the planet thing. if by adding something you need a new formula, then probably the old formula was poor to begin with

Wontonimo: or was the question rhetorical ?

AntiSquid: "then I probably want a reality check before that do-or-die moment" last minute submit reality checks LazyMammal

AntiSquid: no Wontonimo, it's a geniune question, you might find more data later ...

LazyMammal: hehehe, actually that's a good point. 11th hour contest submission has no take-backs. good idea to have confidence about overfitting. so pipeline should have internal reality check. bot arena with old versions is a good start.

reCurse: It doesn't seem like it would affect anything though

reCurse: You already have validation through the final games

Wontonimo: imo, the test is testing if the NN can digest the training data well and make sense of it properly. So if validation passes, then I have no problem retrain on all data.

reCurse: Not like the validation results would give you anything actionable

Wontonimo: validation / test is way more important with medical applications, flight simulators, things were lives are on the line, like UTTT

Wontonimo: okay, maybe not UTTT

LazyMammal: reCurse, action choice is "submit or don't submit". If I fool myself that NN is best-ever-badass and it might still fail purpose as engine-of-fighting-bot. I won't know if I only looked at NN data fit. I also need to look at bot performance. This is analogy of test vs validate but also a real scenario.

KalamariKing: Wontonimo but then can you continue training after that, with no 'new' val data?

reCurse: There is nothing to gain by not submitting though

KalamariKing: Maybe keeping your confidence high

Wontonimo: on CG, yeah, submit

reCurse: I can totally see for a deadline like research paper, critical domains, etc

reCurse: Thanks for that, missed the perspective

reCurse: Still seems like it's overkill for the majority of applications though

Wontonimo: it is

Wontonimo: it's a good habit, like unit testing

reCurse: Oh no don't get me started

LazyMammal: But you could totally lose internet points if you drop 50 ranks with a stinker submission 5 minutes before the contest closing! That's real life stakes!

Wontonimo: KalamariKing - if it is for something serious, absolutely not.

reCurse: I think the validation won't help you much there

reCurse: The arena is the best validation there is

reCurse: See the number of times people complain their local 80% wr bot sinks in the rankings :P

Wontonimo: different things - arena is validation of your bot, validation set is for validating your model which is only part of your bot. Unless you are doing RL, then your bot and model are intertwined

reCurse: Well it seems like acting on the validation in this case would lead you astray more than otherwise

reCurse: shrug

LazyMammal: welp, I can't argue with that. everyone has to choose cost/benefit for themselves. if offline arena is too hard or slow to make then don't do it. a good fallback plan is to split test/train/validate for NN data.

reCurse: Adversarial is pretty different

KalamariKing: Wontonimo I thought not, but if you have a limited dataset, how can you train more

KalamariKing: Thats the issue I have with training on val data when you're done

reCurse: Training on all data seems like a mistake to me

reCurse: Even after validation

Wontonimo: reCurse and I were talking about this yesterday was it? you can have several models, each trained on a different train/test split

Wontonimo: then combine them to vote on the answer

[kirr]: Hi guys

jacek: ensemble model

jacek: eh

Wontonimo: yeah that

reCurse: That's a different issue from validation though?

Wontonimo: "if you have limited data, how can you train more" and "training on all data seems like a mistake". I think ensemble model addresses both (so long as you ignore validation)

reCurse: Ok I didn't see ensemble like training on all data, even though it technically is

reCurse: It's more of an issue of having a training session with all data, then you have zero tools to spot overfit, and it might start overfitting where it didn't before

Wontonimo: i've only done toy ensemble models. it's never come up at work or in production for me

reCurse: If you have separate models trained on partial data like an ensemble, then yes, that won't occur, and you still have a tool to make sure each model is reasonable

reCurse: Maybe not so useful for inference, but for testing that sounds pretty good if you can afford it

Wontonimo: not on CG where you only get 100ms and you want to run 50k simulations

reCurse: Well even in general

Wontonimo: at work, taking x10 as long to increase accuracy by 1% is totally worth it.

reCurse: Oh, then how come you didn't get around to trying that? Are there better methods?

jacek: where you work at

1415495: reusing you the same test sets, remind me of http://imgs.xkcd.com/comics/significant.png

reCurse: Am I naive in thinking the hyperparameters can't overfit that much to test set?

Westicles: Volcanoes got approved

jacek: oO

Wontonimo: How many hyperparameters do you have? 2? maybe not. 120? probably

reCurse: I'd like seeing an actual example where an hyperparameter overfit the test set and bombed the validation test

1415495: reCurse: with a sufficient test sets, and the classical hyperparameters, I don't think that you can overfit the test cases (if they are actually different that the training set)

reCurse: That's my intuition yeah

Wontonimo: the number of layers, size of layers, connection of layers, preprocessing steps, preprocessing thresholds, loss factors are all hyperparameters if you change them.

1415495: https://arxiv.org/abs/1611.03530

1415495: (Understanding deep learning requires rethinking generalization)

Wontonimo: i'm not disagreeing with you reCurse, just stating what i htink when i think hyperparameters

Alshock: is it bad if I butt in and say, uh, hyperparameters? Like magic numbers, but for the magic numbers spaghetti that are NNs?

MSmits: to some (much lesser) degree, if you fit the hyperparameters to get the best result on the validation set, you're using a validationset as a training set.

reCurse: Yeah, I have a hard time thinking changing those would work well on training, test but not validation

MSmits: and i should probably call it test

reCurse: Thanks for the paper fenrir I'll check

jacek: Alshock parameters like learning rate, or the size of NN

MSmits: so I get why you would want a 3rd set in some cases... probably not applicable for us though

reCurse: Parameters are part of the model, hyperparameters are part of how they affect training the parameters.

reCurse: You could also call them metaparameters if it helps

reCurse: IMO

Alshock: tyvm, I figured it was that *tries to get away looking brilliant and knowledgeable*

Wontonimo: lol

Wontonimo: i don't like the term hyperparameter. it's just another parameter

MSmits: you could also use a GA to find the best hyperparameters. That GA will also have parameters

MSmits: hyperhyperparameters

jacek: neat

Wontonimo: your could make a NN to find the best parameters for the GA

MSmits: maybe the same NN the GA is finding hyperparamters for

1415495: about the hyperparameters: an argument that I see is: what is the entropy of your hyperparameters ?, basically if it's entropy is lower than you test cases, I don't see an issue

MSmits: entropy can only increase afaik

jacek: phycisits :unamused:

1415495: otherwise, you have a compressor that can compress down to 1 bit

MSmits: :P

Wontonimo: and my son when talking about his room

1415495: I am talking about Shannon entropy

Alshock: Hey now that's an idea

Alshock: woopsie scrolling

jrke: CG is still very slow

jacek: :scream:

1415495: and in physic: the entropy can go down spontaneously, it is just *so* improbable that it happens on near infinite timescale

jacek: jrke found out the issue with the vector?

MSmits: sounds theoretical to me

jrke: i finally fixed the vector issue

MSmits: like a hypothesis

jacek: what was it

Wontonimo: what was the vector issue?

jrke: the issue was like i push data having .x =- 1 and i further process that in map[data.x][data.y]

jrke: just not pushing that -1 worked

jrke: and made in top silver without any mine

Alshock: map[INTMAX] may not have been your best access ever xD

jrke: my rank 108 in silver but my currrent submission ened in top 15 or 10 in silver cause rank is not getting updated now but CG enhancer says that last battles are from top 15

Alshock: try to bring back the random memory check when you battle against the top 15

1415495: and for overfiting: a nice blog post about GPT-2 and private information leaks: https://bair.berkeley.edu/blog/2020/12/20/lmmem/

Wontonimo: fenrir - from the book "magic of reality" , it is possible that if you put all the parts of a car in a large bag and shook it the car would assemble itself, but it is highly improbably.

1415495: that's the same thing yes

Alshock: The future is adding an output to your NN that is estimating its own level of overfitting

MSmits: ah but that only works if the bag with the car parts is a completely closed system.

MSmits: otherwise there's probably some outside interaction, increasing entropy elsewhere

Alshock: You mean it's actually much more probable than estimated?

MSmits: entropy is mostly a statistical notion though

Alshock: Gotta buy a car and break it down to pieces right now

Alshock: does anyone have a big big bag?

reCurse: Funny fenrir, reminds me of automaton's regurgitating exact sentences sometimes

jacek: Automaton2000 or AutomatonNN

Automaton2000: but you could also use it to improve your skills

Alshock: RIP AutomatonNN

1415495: yes, might be the same king of issue

Alshock: @Automaton2000 What if I have no skills whatsoever?

Automaton2000: was thinking of something else

Alshock: aw ok... :disappointed:

1415495: another interesting paper is https://arxiv.org/abs/1803.03635 (The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks)

reCurse: Yeah I remember that one, still have no idea what to make of it :)

1415495: basically, most of the weight are useless (if not overfitted), but you nee them to be able to train

1415495: (I understood it like that)

AntiSquid: how much did you read fenrir ? papers overall

reCurse: I think it's the driving idea behind teacher/student networks if I'm not mistaken

reCurse: Distillation and all

1415495: a lot, while doing my NN on CSB

1415495: and then I find the domain interesting

reCurse: I find it more applicable when seen as such anyway

Wontonimo: that paper about lottery ticket hypothesis is great.

Wontonimo: it's a fantastic lesson on : larger networks train faster, but once trained they can be smaller.

Wontonimo: once smaller then can't learn much anymore

1415495: reCurse: for distillation yes, but I also remember that training with the teacher output is difficult, so they make them 'softer' at the begining

Wontonimo: so so similar to alzheimer's

1415495: mmh, didn't saw that point, but making me think now

AntiSquid: regarding the example you gave where GPT memorizes personal info, i mean it's accurate real data, it retained information, great, any idea of related topics / papers for this example on how to make the model understand better what values are likely to change at some point ?

Wontonimo: as we lose neural connections we retain most of what we learnt but lose capacity to learn

1415495: AntiSquid: I don't think I saw something about that, no

reCurse: I don't know if it's train faster, or you need a lot more 'space' to learn before you can focus on what matters

reCurse: Like a puzzle

reCurse: You have some little clusters all over the place = more space

reCurse: Then you start assembling = less space

reCurse: If you can't store all the small hints you don't make progress

reCurse: Anyway, maybe just rambling

Wontonimo: yeah, i haven't seen anything either AntiSquid, other than maybe knowledge graph architectures. I don't have links but the idea is that the network only learns how to traverse a knowledge graph and not the data in it.

Wontonimo: you can then change the data

Alshock: we're all just ramblnig anyway, I don't like calling it "space" because in my opinion it's more like actual data. A bit like a "what else could I try?" data

AntiSquid: right Wonto .

Wontonimo: hey, reCurse , here is a great video by the late Professor Winston about larger vs smaller nets https://youtu.be/VrMHA3yX_QI?t=2344 , i've bookmarked the time in the video where he starts to dive into this specific topic

1415495: one big issue related to that is: how do you check against hidden 'trap door'

reCurse: Cool thanks

Wontonimo: he shows that with a small enough network, it can't learn the problem, but it CAN remember if it learnt when it was larger

reCurse: I'll have to try teacher/student at some point

reCurse: Does bother me a lot that a big part of the NN doesn't seem so useful

reCurse: After training

1415495: for RL: becarefull: you must train it on point explored by the student

reCurse: Makes sense

Wontonimo: AntiSquid but i haven't seen anything groudbreaking about about knowledge trees lately

1415495: otherwise, it may have a small loss, but will go where untrained very easily

jacek: teacher/student? so teacher NN labels data and student learns on labelled data?

reCurse: Yeah I still fail to have a reasoning over why it's important to learn near on-policy

reCurse: But that's how it works in practice

1415495: because with a policy, you are accumulating drift over the trajectory

1415495: and if you go where untrained, you have random data

reCurse: jacek: yeah

reCurse: Oh, right.

reCurse: Should have been obvious

jrke: reCurs e so you designed something for training OOC NN

jrke: ?

reCurse: I didn't start anything

reCurse: Just musing

jrke: damn i always forgot question mark after questions

jacek: why

jrke: oh so ran updated

jrke: now 8th in silver

jrke: rank*

reCurse: fenrir: Though I think the part where I had trouble with, since the trained doesn't have much data about those states anymore, shouldn't it be the same?

reCurse: Same issue on both networks

jrke: btw is that me or everyone - does CG enhancer brakes the outfit of ide?

reCurse: So why would training "off-policy" be so much worse than sampling what the trained network knows best

reCurse: I mean, 99% of what it explores

reCurse: Because the latest epochs will be concentrated on what the trajectory it typically takes anyway, so that's where the learning occurs

reCurse: So in theory if you cover the same space on a new network, it should work (but doesn't)

reCurse: Unless some parts of the trained networks survive through epochs of training somehow

1415495: unless the student take exactly the same action that the teacher, you will enter some place not teached, if the student take corrective action and go back where it was trained, no issue, but otherwise it will go somewhere untrained, where anything that it output will be random

Paul_Demanze: does anyone know what is the max amount of code

jrke: 100k chars

reCurse: Yeah my question is the teacher does not go there anymore, so chances are it lost that information as well?

Paul_Demanze: ok thx

reCurse: Oh I think I see

1415495: I mean: if you play the teacher: it won't explore all the possible states

1415495: and so the student won't have seen the state where it want to go

1415495: and won't have learner anything for them

reCurse: Yeah I'm seeing it as a problem of distribution of data now

reCurse: That makes a lot of sense

1415495: so as soon as the student diverge from the teacher, you enter uncharted territory

1415495: (for the student)

reCurse: The student needs to focus on its mistakes but doesn't have nearly enough data to do so

reCurse: I'm happy with that now

reCurse: Thanks

1415495: the teacher may know how to recover, but won't have shown it to the student

1415495: cool

reCurse: Some form of importance sampling maybe :)

1415495: I see it as: due to small inaccuracy in the learn actions the student may end up somewhere where it has never been (or not often enough) teach what to do

1415495: making it do more mistake, that accumulate

reCurse: Yeah it's in the same area, I just have an easier time framing it as "it doesn't see its mistake enough to correct it"

1415495: (but if the student has perfectly generalized what it should be doing, there shouldn't be an issue, but ... ;))

1415495: it's probably the same issue with RL with off policy data which seems damn hard

1415495: (I mean truly off policy, not just 'near offpolicy'

reCurse: Definitely seems like a probability mismatch to me yeah

reCurse: Though off-policy and on-policy seem to be ill-defined as well

reCurse: In papers I read anyway

1415495: I have seen some success in a paper where that had to trained multiple model and then average them

reCurse: Oh SWA?

reCurse: That seems like a common technique in general

1415495: ah no

1415495: I think it was a paper on deepmind

1415495: (but they seems to be multiple about that issue)

1415495: https://arxiv.org/abs/1907.04543v3

reCurse: Hmm ok

MSmits: someone who knows nothing about ML is going to find this conversation about students and teachers very strange

reCurse: Huh?

1415495: the key point seems to be a lot and lot of data, and models ensembling

reCurse: Wait

reCurse: I remember a paper saying the exact opposite am I hallucinating

1415495: ah the wonder of research ;)

reCurse: Arguing that Q-learning was not really off policy

1415495: it depends on what king of data you train your q function

jacek: mandela effect eh

reCurse: Because models trained on their own experience replay outperformed those that trained on foreign ones

reCurse: Or something like that...

1415495: on policy basically means using sample from running your policy (and where your NN has not yet changed)

reCurse: Well off-policy vs on-policy is a 'gradient'

1415495: the more it has changed, the more off policy

reCurse: You're still closer to on-policy if it's the model that generated the data with epsilon greedy

1415495: (ie the policy used to sample versus the policy to act)

reCurse: Than if it's a model completely unrelated

1415495: yes

reCurse: Trying to find that paper now...

reCurse: Got it I think

reCurse: https://arxiv.org/abs/1812.02900

1415495: I think they are at least partially compatible

1415495: in the other paper they average multiple model while learning, which will decrease a lot the overestimation (as the models have uncorrelated errors)

1415495: is it enough, no idea

reCurse: Puzzling

1415495: they also use 50*10^6 data points per game

1415495: they also seems to think in the same terms than you: "Offline RL is considered challenging due to the distribu- tion mismatch between the current policy and the offline policy'

1415495: mmh truncated, the remaining: "data collection policy, i.e., when the policy being learned takes a different action than the data collection policy, we don’t know the reward it would have gotten."

reCurse: Well most likely me thinking like them from subconscious reading ;)

1415495: :)

Lucky30_: :thinking:

aCat: Is there a problem with server?

aCat: I got nondeterministic results of a submit of puzzle

aCat: and often "Failure Process has timed out. "

aCat: for a simple bunch of ifs

aCat: Ah, that's a grrovy problem

BlaiseEbuth: Yeah groovy is a problem...

aCat: found fix to use java within

aCat: made my day

aCat: as the entire goal was to make groovy language achieements

aCat: and now I got it free

aCat: :D

BlaiseEbuth: Same for me. How do you use java in it ?

Default avatar.png Joe211: sometimes I feel like the compiler is the problem

Default avatar.png Joe211: but im just retarded

aCat: BlaiseEbuth https://www.codingame.com/forum/t/groovy-gets-timeouts-in-every-clash-now/190579/5

BlaiseEbuth: Oh cool. Thx aCat :thumbsup:

aCat: :-)

jacek: nice rank robo, im happy for you :angry:

jacek: :upside_down:

RoboStac: thanks :)

jacek: the nn word?

RoboStac: yeah, realised I was doing something stupid

jacek: like wasting time for this game

Default avatar.png parthi_v: can anybody explain how to solve

jacek: solve what?

Default avatar.png parthi_v: i m new to this can u tell me how to do


ddreams: how to do what?

ddreams: the art and science of asking questions

Default avatar.png parthi_v: chess game

ddreams: It's impossible to help you if you don't provide more information

ddreams: Are you able to do any puzzles?

Default avatar.png parthi_v: am learning python

Default avatar.png parthi_v: i came here to solve some puzzle

ddreams: which chess game?

Default avatar.png parthi_v: https://www.codingame.com/ide/puzzle/chess-board-analyzer

ddreams: how far have you come?

Default avatar.png parthi_v: just now i have entered

ddreams: well, then the next step is to think about how to solve it

ddreams: any ideas?

Default avatar.png parthi_v: no idea sorry

ddreams: then I suggest you pick another puzzle and come back to this one with more experience

ddreams: it says it's a hard puzzle

aCat: start with some easy ones

aCat: and maybe ithut user names attached - they are original CG puzzles

aCat: better for beginners

aCat: without user names

ddreams: https://www.codingame.com/training/easy

Default avatar.png parthi_v: ok fine let me try another one

CodeLoverboy: ooooooooooooofffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

CodeLoverboy: oof

CodeLoverboy: oof

Greg_3141: In Rust, how can I make a bot return the best move it's found so far instead of timing out?

jacek: how do you do it in other languages

Greg_3141: idk i've never done it before

struct: you need to time it

struct: and return it before times runs out

jacek: so the question is more general than "in rust"

jacek: what kind of search you do? minimax?

Greg_3141: yes

jacek: then use iterative deepening

Greg_3141: ok

Gumarkamole: kotlin compiler is soooo slow

Gumarkamole: i am waiting for tests longer than writing my code

ZarthaxX: maybe your code is slow tho

Default avatar.png nuggetbucket54: i'm so trash

ZarthaxX: :(

ZarthaxX: nice self steem

ZarthaxX: esteeem*

itzluku: coc rank update in 5min? or when

ZarthaxX: for your global rank?

ZarthaxX: it's in 3 hours

itzluku: ah ok ty

NomNick: hello, can we use random method in codingame environments ?

NomNick: like golang rand pkg ? https://golang.org/pkg/math/rand/

Greg_3141: Why would you not be able to?

NomNick: because I dont succed at seeding it :(

NomNick: so the better question is how to seedrand on codingame

ZarthaxX: like in any other lang NomNick

ZarthaxX: codingame doesnt work differently

BruteForceLoop: Some of these easy algorithm problems are pretty hard, I tried to do the 1D spreadsheet problem for 3 days and failed, or maybe I'm really bad at algorithms.

ZarthaxX: you will get better with time prob and eventually solve them faster

ZarthaxX: also sometimes there exists that one puzzle that is hard for us

BruteForceLoop: ZarthaxX Thank you for the advice. It's demotivating being stuck at a problem for 3 days, I was having nightmares about the problem.

struct: NomNick I think it works

struct: I just copied snippet from the web and it worked

struct: https://tech.io/snippet/gC7Vrvo

NomNick: hello i tried a default code with some rand testing and it seems to work too

NomNick: so no surprise here, the issue is between the chair and the laptop

NomNick: what's tech.io and how is it related to codingame ?

struct: tech io is from codingame

NomNick: was reading the about us

NomNick: they spin off their tech stack ?

struct: what does that mean?

NomNick: is it a business venture ? trying to sell codingame tech to university for example ?

NomNick: to provide teachers with tools to enhance their materials & classes ?

struct: I dont think so

struct: its free

NomNick: nice of CG

NomNick: ah. I see my mistake at least

NomNick: of course, fucking Golang

NomNick: for i := range rand.Perm(N) will give you the index. I still don't get how the golang designers though it was clever to put the index before the value in range outputs

ddreams: key/value perhaps

Decco: yooo

Decco: my ppls

Decco: i love you guys so muuuuuuch

Default avatar.png HapppyDe: yo technoblade

Decco: gaymer

SuperArtyK: gg

KiwiTae: scary doll following me aaaah ~~

KiwiTae: good morning o/

7-4: hi mates

KiwiTae: sup~

7-4: ay

Stormalix: ayyyy

7-4: how is it going

Stormalix: good lol wbu

7-4: bad

Stormalix: aw whyy

Smelty: sad 7-4 nosie

7-4: wtf

Default avatar.png SickNerd: g