Chat:World/2021-06-01


memcorrupt: https://escape.codingame.com/?fromToken=C12-6FP-itS-dKq

memcorrupt: im at #2

memcorrupt: plz use my link :pray:

eulerscheZahl: no, dbdr is at #2 I dropped :(

Chainman: I'm like #615

jrke: dbdr climbing a lot

Chainman: Wait or are you talking about rank?

eulerscheZahl: he plays the community games that slowly get more players

eulerscheZahl: https://www.codingame.com/leaderboards/general/global

jrke: dbdr can be first in few days

jrke: you yourself can try that euler

jrke: playing community games

eulerscheZahl: but i'm lazy and cherry picking

jrke: some are a good investment

eulerscheZahl: when I have 2 options: community multiplayer and contest on another website

eulerscheZahl: what do you think i would do?

jrke: contest

jrke: not just you everyone

eulerscheZahl: correct. codeforces ended 2 days ago

eulerscheZahl: i might try topcoder which starts tomorrow, not decided yet

jrke: i only code on CG and occasionally do puzzles on hackerrank

eulerscheZahl: i stopped hackerrank when hackerrank stopped contests

eulerscheZahl: i liked their 2-day-long code sprints

jrke: is there any other website with contests like CG?

eulerscheZahl: not currently running

eulerscheZahl: this one *might* be interesting, i don't know myself yet https://www.lux-ai.org/

TranTuan1: many websites like CG

TranTuan1: such as https://codeforces[.]com/

eulerscheZahl: codeforces is exactly what CG is not like

eulerscheZahl: here on CG we have bot programming contests lasting 11 days, codeforces has short rounds (2-3h) with several problems that have a clearly correct answer

eulerscheZahl: there are occasional marathons with optimization problems (like the Huawei contest last week) but no multiplayer games and no visuals

eulerscheZahl: also CG works on problems with a much smaller scale (like map size) just for visual reasons alone

eulerscheZahl: the Huawei contest had a graph with up to 5000 nodes, you won't find anything like that in a CG contest

jrke: yeah

Kevin24: robot

derjack: good morning

BlaiseEbuth: ¡ʞɔɐɾɹǝp ʎǝH

JumpJump: hello

zapakh: :upside_down:

JumpJump: where are you from

derjack: apparently blaise and zapakh are from australia :thinking:

kar1m: test

Federoll: no <script>window.location = "https://teletubbies.com"; </script>

cegprakash: hi anyone good with ruby on rails? I've some beginner noob stupid doubts

cegprakash: 1) why do some variables have a colon in front and some do not

cegprakash: 2) what does a line with just a variable before the end of a function do? Does it return it?

KiwiTae: cegprakash :x is a symbol

KiwiTae: :smth

cegprakash: colon is a symbol yes but what do they do?

KiwiTae: it's a bit like & in cpp, it maps to a memory address

KiwiTae: or not lol actually dont know much ruby

cegprakash: http://chat.codingame.com/pastebin/005fbda7-79a1-4799-a985-b52569556dfa

KiwiTae: just know it's useful :hehe:

KiwiTae: first one is hash

KiwiTae: and if the 2nd doesn't work it's prob because it's not ruby syntax?

cegprakash: there is a function in our production code which looks like

cegprakash: def fn

a.b

end

cegprakash: what does this function even do :D

BlaiseEbuth: -> remove it and see what's not working.

BlaiseEbuth: Pro tip

cegprakash: :joy:

cegprakash: it's a big code base

cegprakash: I don't even know what triggers which function

cegprakash: just trying to understand the syntax first

cegprakash: I built a sample working application on rails and even wrote a blog but still I don't know basics :D

derjack: i think you need to learn ruby syntax before trying rails

cegprakash: A ruby symbol is like an Enum constant in Java or C++ like:


cegprakash: now it makes more sense

Federoll: "shortest possible solution" challenges aren't even fair lmao, python guy produces an 80 char solution when I do it with 750 characters in c#

BlaiseEbuth: You can learn python.

derjack: use python for shortest then :shrug_tone1:

cegprakash: haha I can't choose sadly.. I already know python. This is for work

cegprakash: I thought it was for me derjack

derjack: not this one ~

ash_rick: https://www.codingame.com/clashofcode/clash/1787370f42ab48a58bc55a47794dc80c1aa5606

BlaiseEbuth: #clash ash_rick

1rre: "You can learn python" is a bad solution though; you could either use something like scale factors: http://cloc.sourceforge.net/#scale_factors or remove the boilerplate/execute other languages in the repl

mathboy: hi

KiwiTae: Federoll im pretty sure u can do better than 750chr in c#

Chainman: hey

Xeno_1221: hey

Chainman: I'm so confused about how to propagate up winrate for mcts :(

memcorrupt: im #2 on the list to get access to this. please use my link so i can get priority :pray: https://escape.codingame.com/?fromToken=C12-6FP-itS-dKq

derjack: Chainman why

darkhorse64: you don't backpropagate a winrate but a result

BlaiseEbuth: How much do you pay memcorrupt?

memcorrupt: ???

darkhorse64: if result is a win for white, when the node is black turn, count it as a loss

darkhorse64: memcorrupt: it's a joke

BlaiseEbuth: You want us to click your link. You get an advantage from this. What do we get?

Chainman: I simulate for computer, and if computer wins I increment winrate for nodes that are his turn

Chainman: wait huh?

Chainman: don't backpropagate winrate but a result?

Chainman: sorry yeah

Chainman: I worded that wrong

Chainman: I backpropagate result that updates winrate in the nodes

darkhorse64: to be accurate, you increment/decrement the score on each node taken by your search and you increment the number of visits. That gives you a winrate
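
A minimal sketch of the backpropagation darkhorse64 describes, in Python, assuming a hypothetical Node with parent, visits, score and player_just_moved fields (result is +1 for a player-1 win, -1 for a player-2 win, 0 for a draw):

    def backpropagate(node, result):
        # Walk from the expanded leaf back up to the root.
        while node is not None:
            node.visits += 1
            # Score each node from the viewpoint of the player who moved
            # into it: a win for that player adds, a loss subtracts.
            node.score += result if node.player_just_moved == 1 else -result
            node = node.parent

The winrate the selection step sees is then node.score / node.visits.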

Chainman: at each node, you have the result

darkhorse64: yes

Chainman: and from my understanding you increment winrate if the result is positive for that player

Chainman: or you decrement

darkhorse64: yes

Chainman: must be something else wrong

derjack: oO

MSmits: yo

derjack: :scream:

MSmits: :grin:

MSmits: I improved my endgame book generator by a factor of 10. It went from 1 to 25 seeds overnight

MSmits: think i can get to 36 if i take a few weeks

MSmits: after that I am memory limited (and it would go beyond 120 GB of disk)

derjack: time to get more RAM then

jrke: just a doubt: in MCTS we simulate till we get a defined or fixed state

jrke: so that fixed state can be at depth X?

MSmits: there are several ways to do that jrke

MSmits: you can play out till game is finished

derjack: in normal MCTS simulate until end of game

MSmits: you can also playout to depth x

MSmits: and evaluate

MSmits: or you can choose not to playout at all and just evaluate after expansion

jrke: so what do we call that rollout?

MSmits: you need a good evaluation function. That's the most important thing

MSmits: early playout termination

MSmits: MCTS-EPT

MSmits: I use the "no rollout" version

MSmits: in onitama and oware

MSmits: and othello
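
A minimal sketch of the three simulation options MSmits lists (full playout, playout to depth x plus eval, eval-only after expansion), assuming hypothetical is_terminal, result and evaluate helpers and a state with moves() and play():

    import random

    def simulate(state, mode="full", max_depth=10):
        # "none": no playout at all, evaluate right after expansion.
        if mode == "none":
            return evaluate(state)
        depth = 0
        while not is_terminal(state):
            # "ept": early playout termination, cut off and evaluate.
            if mode == "ept" and depth >= max_depth:
                return evaluate(state)
            state = state.play(random.choice(state.moves()))
            depth += 1
        # "full": classic MCTS, play to the end and return the exact result.
        return result(state)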

derjack: EPT - early playout termination

derjack: ahh you wrote it

MSmits: it's ok, repetition reinforces

derjack: and overfits

MSmits: sure

MSmits: I'm still thinking about how to use meta mcts for supervised learning

MSmits: seems only workable for nearly solvable games

MSmits: or you wont get enough depth

MSmits: oware will work I think

MSmits: Can do bandas and onitama as well

MSmits: but breakthrough is doubtful

derjack: breakthrough has big branching factor and rather long term profit openings

MSmits: yeah

jrke: clobber?

MSmits: onitama i can just generate random starts, solve them and then learn them

MSmits: same with bandas

MSmits: dunno about clobber

MSmits: seems impossible to even run a meta mcts :P

MSmits: I did it for a while, all moves 50% WR :P

jrke: :smiley:

jrke: oh C4 is the POTW

MSmits: yeah I dont think i would try to train a NN for that

MSmits: too small of an improvement over my current bot I think

derjack: youd need convnet for that

MSmits: well it would sure help

MSmits: but you do these games without it

MSmits: like breakthrough

jrke: i got this yesterday - https://youtu.be/PJl4iabBEz0

derjack: MLP NNs are quite crap for connection games

derjack: i was thinking about implementing c4 in python and use the libs for convnet

MSmits: or wait till Marchete does the convnet

MSmits: well you'd still do python localy

Spiderman2222: what

Spiderman2222: what u mean

MSmits: keras is sooooo easy to use

MSmits: I appreciate it because I used the handmade simple MLP thingy first. With Keras that reduced to 5 lines

MSmits: training was faster, but prediction was slower

MSmits: (singular prediction)

MSmits: probably need to use tensor input as opposed to numpy
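
A sketch of the roughly-5-line Keras MLP MSmits mentions; the layer sizes and the 9-cell TTT-style input are made-up illustration values:

    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Dense(32, activation="relu", input_shape=(9,)),
        keras.layers.Dense(1, activation="tanh"),
    ])
    model.compile(optimizer="adam", loss="mse")
    # model.fit(X, y, batch_size=256, epochs=20)

For single-position inference, calling model(x) directly is usually faster than model.predict(x), which carries per-call overhead; that may be part of why singular prediction felt slow.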

Spiderman2222: OMG

derjack: hm?

jrke: i can't install pytorch in windows machine any suggestions?

BlaiseEbuth: use linux

derjack: wsl?

Chainman: wsl2

Chainman: omg, I can't find what is wrong with this mcts, I feel i'd need to draw out the graph and states lol

Chainman: I need some graphical representation beyond just printing

derjack: and whats wrong with it? is not winning?

1457162: What is c4?

BlaiseEbuth: An explosive

DomiKo: Connect 4 too

BlaiseEbuth: A cell on many game boards

Chainman: yes it is losing

Chainman: It is very bad

Chainman: It's weird the winrate is so negative for nodes that represent the bot

Chainman: I tested

Chainman: and in all of its rollouts over an entire game it finds player0 wins 391943, and player1 wins 3324

Chainman: And this is what always happens

Chainman: it finds a disproportionate number of wins for player0 always

Chainman: That must be wrong, I just need to find what is causing that.

Uljahn: :duck:

BlaiseEbuth: :shotgun:

Chainman: Is it possible to have too large of a rollout?

DomiKo: bigger is better

Chainman: I think I found a problem maybe

Chainman: it had a point where it had two possible moves

Chainman: one of them would be win, and other would lead to tie

Chainman: but, some reason the winrate is -1 and 0

Chainman: it should be 1 and 0

derjack: if you do a rollout and it's a win, do you credit the win to the correct player

Chainman: I can't understand it, but yeah it's doing something wrong at the winning stage

derjack: wrong sign

derjack: you could also have over 2 billion rollouts and then get int overflow

Chainman: nah it's in python rn

Chainman: only 100k iterations

Chainman: It might be wrong sign

Chainman: if I reverse the signs the bot is worse

Chainman: The thing is it is hard for me to beat the bot for some reason

Chainman: to win

Chainman: but when I let him win, he always goes for tie

Chainman: He just wants to tie with me lol

Uljahn: Automaton2000: look, i can't debug my own code lol

Automaton2000: what do you think of a better algo

MSmits: Chainman these are very common mcts issues. The best thing you can do is follow every step of a rollout and manually check it

Chainman: I think I fixed it

Chainman: It is actually making winning moves now

MSmits: good

MSmits: which game is this?

Chainman: I was accidentally expanding and simulating beyond terminal states

MSmits: oh right c4

Chainman: that is why I was seeing the negative winrate and an offset in winrate at obvious terminal states.

MSmits: ahh ok

MSmits: you can improve your c4 bot by avoiding losing moves in the rollout

MSmits: like... dont play a move if it gives your opponent a winning move next turn

MSmits: but only do that if you're sure you've fixed it

derjack: bitboard?

MSmits: smart rollout is probably more important than bitboard, but it's much easier to make a smart rollout with a bitboard

MSmits: well for me anyways, you need some bitboard skills

MSmits: my rollout doesnt ever produce 4 in a row

MSmits: it terminates when there is no move available that doesnt give your opponent a winning move
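
A sketch of that rollout policy, assuming hypothetical legal_moves and is_winning_move helpers and a state with play():

    import random

    def smart_rollout_move(state):
        moves = legal_moves(state)
        # Take an immediate win if one exists.
        for m in moves:
            if is_winning_move(state, m):
                return m
        # Keep only moves that don't hand the opponent a win next turn.
        safe = []
        for m in moves:
            nxt = state.play(m)
            if not any(is_winning_move(nxt, r) for r in legal_moves(nxt)):
                safe.append(m)
        if not safe:
            return None  # every move loses: terminate the rollout here
        return random.choice(safe)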

derjack: so smart

MSmits: darkhorse does this too... works nicely

MSmits: i tried smart stuff like this for uttt, but it never yields much improvement

MSmits: derjack is there a special reason your pony is upside down? Are you in Australia when you're on this account?

punter147: http://chat.codingame.com/pastebin/bd3f3fd1-7ec3-48a0-8a48-c4cf9e2fc63c

BlaiseEbuth: "when you're on this account"?

MSmits: it's a test account belonging to a well known player

derjack: yeah. i like alps

MSmits: punter147 is this regular mars lander?

MSmits: marslander 1?

MSmits: if so... then jeez terrible overkill with GA

punter147: it's level 2 of the mars lander.

derjack: oh

MSmits: ohh ok. I solved it with if else

MSmits: it's really hard though, took me 2-3 days

BlaiseEbuth: The pony isn't really the pony? :scream: Who's he ?

MSmits: pony will PM you if he wants you to know

BlaiseEbuth: Why so secret...

punter147: wow really? yes it would be very difficult with if else, at least for me

MSmits: hey i dont give away other people's identity.

MSmits: punter147 I was new on CG then, 3 yrs ago

MSmits: otherwise i would have tried GA maybe

MSmits: many have

MSmits: you can probably find out a lot about the marslander ga on google

MSmits: i think people wrote guides even

BlaiseEbuth: It's jacek.

punter147: oh nice suggestion, i will check all the guides available thanks a lot MSmits

MSmits: hi jacek, i didnt know BlaiseEbuth was also your alt

BlaiseEbuth: Devil is inside everyone.

MSmits: is this your religious alt jacek?

MSmits: oh it's your avatar

darkhorse64: re c4 bitboarding is so much faster than grid based sim that it makes a huge difference in the leaderboard
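
For reference, the standard connect-4 bitboard win test looks like this minimal sketch; it assumes one bitboard per player laid out column by column with 7 bits per column (6 rows plus a sentinel bit) as in the classic 7x6 game:

    def has_four(bb: int) -> bool:
        # Shift pairs: 1 = vertical, 7 = horizontal, 6 and 8 = the diagonals.
        # CG's 9x7 variant works the same with an 8-bit column stride
        # (shifts 1, 8, 7, 9).
        for shift in (1, 7, 6, 8):
            m = bb & (bb >> shift)
            if m & (m >> (2 * shift)):
                return True
        return False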

Chainman: Why are so many people doing c4 right now?

derjack: if youre jacek, whos your waifu?

MSmits: Chainman puzzle of the week

Chainman: oh, cool thanks

MSmits: btw, best opening move is 2nd from left or 2nd from right

MSmits: it's the only balanced move apparently

jrke: 4th from middle :stuck_out_tongue:

MSmits: (gives 50% wr for p1 and for p2)

MSmits: 4th from middle is the strongest move, which is why you dont play it :)

darkhorse64: If you play 4th from middle, I'll steal it

MSmits: it has 79% WR in my meta mcts currently, after 25 million games

MSmits: move 3 and 5 are close to 70%

MSmits: move 0 is a little above 30%

MSmits: so 30-50-70-70-80-70-70-50-30

MSmits: roughly

MSmits: wait no

MSmits: 30-50-70-80-70-80-70-50-30

jrke: 80? isn't that too high

MSmits: well, the thing is, once a meta mcts is starting to solve, the wr will only go up

MSmits: get nearer and nearer to 100%

MSmits: because the best counters get solved first

MSmits: and only worse moves remain

MSmits: (for p2)

Chainman: meta-mcts means mcts is simulating with mcts?

Chainman: not random playouts right?

MSmits: it's a mcts where you do everything you normally do, except that a random playout is a full game

MSmits: like a full game as played on CG

MSmits: i did 25 million of those

MSmits: so the results are more accurate than they normally are

Chainman: Oh you are just playing till end of the game in simulations.

MSmits: the simulation finishes the game with the same calc time you get in the real game on CG

MSmits: so they are much slower

MSmits: as the tree gets deeper, your sims become shorter though

MSmits: most of my games are like 5 moves, then they solve

MSmits: so 5*50 ms, so 4 games per second

MSmits: but i use 10 processes, so 40 games per second

derjack: but we have 100ms per move :?

MSmits: yes, but my cpu is twice as fast

MSmits: i try to keep it the same

Chainman: lol dang

MSmits: currently not running it though, busy with oware :)

derjack: im curious if training with actual scores will be better

MSmits: me too, the hard part is the space between 48 and 36 seeds where i dont have a book

MSmits: i can easily run the meta mcts, but i need some way to select move that explores properly, yet does not run into nodes where i have too few games

MSmits: select for training i mean

MSmits: selecting just random high visit nodes might be bad

MSmits: I suppose you run into the same issue with azero type training, how much do you explore?

derjack: i use softmax for final move selection

derjack: temp 1 with first few moves, then temp 3 to 8 depending on game for rest

MSmits: during training you mean, right?

derjack: yes

MSmits: or also during real games?

derjack: no

derjack: oh, also during training i multiply by random [0.75;1.25] eval during selection while in final games its 0.9-1.1

MSmits: ah right

MSmits: I hope you get convnet working some time

MSmits: because when you do, we all get a chance to learn from it

MSmits: it's really quite nice that you share so much

derjack: :blush:

derjack: oops, temp 1/3 to 1/8 during the rest of the moves

derjack: lower temp, more exploitation

MSmits: ahh that makes more sense
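
derjack says "softmax"; one common concrete version of temperature-based final move selection is AlphaZero-style visit-count exponentiation, sketched here:

    import numpy as np

    def pick_move(visits, temp=1.0):
        # temp=1 samples proportionally to visit counts; a low temp like
        # 1/3 .. 1/8 sharpens the distribution toward the most-visited
        # move (more exploitation).
        v = np.asarray(visits, dtype=np.float64)
        p = v ** (1.0 / temp)
        p /= p.sum()
        return int(np.random.choice(len(v), p=p))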

MSmits: anyways, train arriving, ttyl!

derjack: during training

DomiKo: Hi MSmits, how's the NN going?

derjack: he's on train. probably for training

KalamariKing: He's training on a train

BlaiseEbuth: https://www.youtube.com/watch?v=hHkKJfcBXcw

Wontonimo: I watched that all the way through

Wontonimo: banger

BlaiseEbuth: :3

MSmits: i was on train, home now :P

Xascoria: it do the do the does

MSmits: DomiKo I am doing a lot of preparatory work for oware NN

DomiKo: yeah there is a lot to do

MSmits: first generating endgame book for value labels. Then adjusting meta mcts to use the book to label seed states 37-48, then training NN on python/keras, then a working c++ bot with inferrer

DomiKo: any reason for keras?

DomiKo: or just first pick?

MSmits: keras is all i know really and it's apparently easy and popular

MSmits: I used it for TTT

MSmits: (as in: normal TTT)

Wontonimo: keras is in Tensorflow and also used to be a stand alone framework. I think you are using it within tensorflow, is that right?

MSmits: yes

Wontonimo: it sure simplifies a lot without giving up much

Uljahn: the only alternative i can think of is pytorch but they are quite different

DomiKo: yeah

MSmits: I'm sure there are differences, but I doubt I'll ever get to the point where they start to matter

MSmits: as long as it does what it is supposed to do and i can figure out how stuff works, it's all ok

Wontonimo: i've not used pytorch. How are they quite different?

Wontonimo: i'm assuming pytorch (like TF) is all just matrix operations under the hood

DomiKo: Keras is slower

DomiKo: And in pytorch you can do more complex stuff

Wontonimo: like what complex stuff?

derjack: complex numbers :v

Wontonimo: that's just imaginary :P

DomiKo: keras is like high level API

MSmits: can't you do everything you want with TF while using Keras

MSmits: just at a lower level API

DomiKo: I guess you can

RoboStac: yeah, but it's all tensorflow underneath so you can always drop down to that if you need the complexity

Wontonimo: but the framework is there for writing your own. it's not like you are locked into the high level. you can go as low level as you want in tf

MSmits: ah right RoboStac, that's what I meant

MSmits: so it's like getting eased into TF

RoboStac: pytorch is possibly a bit more in the middle - it's not as high level as keras but easier to work with than tf

DomiKo: but then we are comparing TF and pytorch and not keras

RoboStac: at least from my tests

Wontonimo: yes, i was thinking TF vs PyTorch

MSmits: well keras is part of TF now right, so you cant really separate them

DomiKo: then I would say that TF is awful

DomiKo: it's really hard to read

Uljahn: with keras you can switch backend to theano or cntk i guess

derjack: or you make your own NN framework [solved]

MSmits: I found it really easy to use in my first attempt

Wontonimo: yeah, better to just redo the efforts of a 1000 people over several years on your own. Definitely will work out well. I joke, but the attempt is a great learning experience if done strategically

Wontonimo: which i guess is the whole point of NN from scratch

derjack: nah, numpy is too high

MSmits: I liked adapting the xor example (without numpy) to play TTT

MSmits: but i would not go for more complex things with my own framework

MSmits: too much work for little gain

MSmits: when i converted it to use Keras it was sooo much nicer

MSmits: both worked though

Wontonimo: after everyone goes all NN on CG, what's next? Quantum Computing? Are we all going to have to learn about qubits in the next 10 years?

derjack: welp https://en.wikipedia.org/wiki/Quantum_neural_network

MSmits: mmh dont think i can leave my book generator running overnight. Seems to eat 1 GB / 4 min

MSmits: because it makes backups

BlaiseEbuth: nom nom nom

derjack: backups? you dont believe in yourself?

struct: every 4 minutes seems a bit extreme

emh: any interesting articles on tech.io lately?

MSmits: struct, i do it every iteration, but with higher seed counts an iteration can take hours. I'm only working on 26 seeds now

MSmits: 36 seeds may take days, I don't know

struct: damn

MSmits: http://chat.codingame.com/pastebin/1278a524-72c2-41c7-9bae-657baf3dbe51

MSmits: so it found book 25 and continued from there, to do 26

MSmits: it keeps going until all states give the same answer between iterations

MSmits: (meaning more turns dont do anything)

MSmits: basically I assume a turn limit of 1 turn at the start and keep stretching the game until the end result for each state is the same

MSmits: (this is a form of retrograde analysis)
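
A generic sketch of that fixed-point idea, assuming hypothetical successors and horizon_score helpers and that every reachable state fits in the table (real endgame books stream this over disk):

    def solve_endgame(states, successors, horizon_score):
        # Start from the 1-turn horizon: score every state as if the
        # game ended right there, from the side-to-move's perspective.
        value = {s: horizon_score(s) for s in states}
        changed = True
        while changed:  # stretch the horizon one turn at a time
            changed = False
            for s in states:
                succ = successors(s)
                if not succ:
                    continue  # truly terminal, keep the exact score
                best = max(-value[n] for n in succ)  # negamax over successors
                if best != value[s]:
                    value[s] = best
                    changed = True
        return value  # fixed point: extra turns change nothing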

Marchete: ok Im pretty stupid

Wontonimo: book generation is something I haven't done any of

Marchete: float randNoise = rnd.NextFloat(1.0f - conf.simpleRandomRange, 1.0f + conf.simpleRandomRange);

reCurse: Don't

Marchete: damn me

Wontonimo: don't be pretty stupid, or don't do book generation reCurse?

Marchete: I wish this can be avoided

reCurse: Why not both

Marchete: there is no cure for stupidity, source: me and 99.9% of humankind

Wontonimo: what are your thoughts on not book generation reCurse?

Wontonimo: other than just don't

derjack: he has trauma from opening books

emh: are there closing books?

Wontonimo: I'd hate to see your bookshelf if you haven't discovered closing books

emh: hahaha

Wontonimo: i think MSmits was just working on a closing book

emh: I see

emh: I want to work on Smash the Code but I don't have the energy

Marchete: with the meta MCTS he has

dungdsadsa: Hi everybody

CHT-DAT: HiHH

dungdsadsa: hi cac

Marchete: RoboStac you there? do you know if invalid moves have negative effects on softmax's crossentropy?

Marchete: in Policy part of A0

reCurse: Wontonimo: It kills competition in fixed start games

Marchete: I did some custom loss to ignore losses on invalid moves

Marchete: but I'm not sure...

Wontonimo: imho, that sounds fine Marchete

MSmits: yeah i am just working on endgame book now and will use meta mcts to deal with the early part of the game. To use as data for supervised learning. Not doing opening books

derjack: Marchete afaik you ignore illegal moves and renormalize the others

Marchete: it sounds "simple", but I don't want to touch anything on the training pipeline

Wontonimo: you could take it 1 step further Marchete and use the valid moves to zero out the logits of the illegal moves before softmax is applied

Marchete: you can do that Wontonimo?

Wontonimo: yeah, for sure!

Marchete: I was thinking about a Multiply input

reCurse: You can do anything you want

Marchete: before policy

Marchete: but I don't know if that affects backpropagation

Marchete: how's the right way?

reCurse: There's no right way and I'm not trolling

reCurse: Use your intuition, try stuff, measure results

Marchete: :expressionless:

Marchete: my intuition is masking out before softmax

reCurse: So try that

reCurse: I'll just mention zeroing the logits before softmax is a bad idea

reCurse: You want to minus BIG them

Marchete: yeah, softmax is not 0, but -9999999.9

reCurse: To have the intended effect

Marchete: I learned it

reCurse: Not saying it's a good or a bad idea but that sounded wrong so

Marchete: def softmax(xs): http://chat.codingame.com/pastebin/aecb1bf4-16bc-4dfb-be45-7c54586d9221

Marchete: it's true, I can't mask

Marchete: I need to -999999

Wontonimo: right right ... sorry about that. don't zero the logits, my bad.

Marchete: that doesn't break the training pipeline in any way?

reCurse: As long as it's differentiable nothing breaks

Marchete: :thumbsup:

reCurse: Now whether that gives better results or not is another story

MSmits: If you like experimentation such as you would in physics, machine learning is really the perfect field in computer science.

MSmits: so many variables, so many possible experiments :0

Wontonimo: you want the gradient to be blocked entirely for the illegal moves, so multiplying by zero is actually important. The equation could be logit*mask - 1e10*(1-mask)
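
Wontonimo's masking equation as a one-liner in numpy; whether blocking the gradient this way actually helps is debated just below, so treat it as one option to try:

    import numpy as np

    def mask_logits(logits, mask):
        # mask: 1.0 for legal moves, 0.0 for illegal ones. Illegal entries
        # end up near -1e10, so softmax gives them ~zero probability;
        # merely zeroing a logit would still leave it weight exp(0) = 1.
        return logits * mask - 1e10 * (1.0 - mask)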

reCurse: At the risk of repeating myself

reCurse: It's debatable whether blocking the gradient is helpful or not

Wontonimo: i'm debating that it is important ;)

reCurse: There's evidence on both ways /shrug

Wontonimo: but, i agree 100% with experimentation

reCurse: Experimentation is more to do with how unpredictable it is

Marchete: probably isn't helpful, but I'm in a very experimental phase

MSmits: it's even more fun when you can use the results of experimentation on CG. Hope to get there.

reCurse: From one problem to another

reCurse: From one run to another

reCurse: All you have is intuition and past results

Marchete: I can have 90% winrate and 20 generations later it's 4% against that best generation

reCurse: Catastrophic forgetting

Marchete: so it's completely broken for now

reCurse: Lots of ways to deal with that

Marchete: I have no idea

MSmits: yeah, I make to do lists

reCurse: If you have a genuine RPS scenario then it gets hairier

Marchete: it goes 80,60,50,40,....60, and suddenly it starts going to hell

emh: empiricism. hmm. I'm eating gouda with viking onion (chives)

Marchete: to a whopping 4% winrate

reCurse: Could also be gradient explosion

reCurse: So many things

reCurse: Have fun

Marchete: I'm loving it

Marchete: *not*

MSmits: you also have to look at some actual games. Just to make sure you aren't playing the same games over and over

reCurse: I do actually, eh.

Wontonimo: can you adjust your learning rate to be a bit lower and batch size a bit higher?

reCurse: What's not to like about an infinite well of mystery

sprkrd: One thing

Marchete: and epochs Wontonimo?

Marchete: how many per train step?

sprkrd: I have no idea what you're talking about, but there's something wrong with the softmax implementation you pasted earlier, @Marchete

emh: imagine how boring if creativity was conquered by AI

Marchete: because K_BATCH_SIZE=256 K_EPOCHS=20

Marchete: it takes like 5secs on training...

Marchete: and the another generation

Wontonimo: yes, and epochs. another trick is to not train on items that the network already knows really well, it helps decrease forgetting.

Marchete: I got small subsets

MSmits: whats the difference between ordinary forgetting and catastrophic forgetting?

Marchete: tldr: all these hyperparameters are like sorcery

reCurse: There's a method to madness

Marchete: move lr high, then go down, then another damn parameter...

reCurse: Just see it like adjusting constants in a heuristic with batches for testing

reCurse: Exactly the same thing

reCurse: CG trained you for this

MSmits: I guess you start with a very simple network with parameters you used before. Also watch the loss graphs

reCurse: Start with the simplest example that works and go from there

Wontonimo: i don't know of a difference between the two forgettings. perhaps the catastrophic one means total network collapse? idk

MSmits: ye, trying to fit 10 eval params by hand with repeated cg bench is actually a lot less fun than experimenting with NN

Marchete: for me training a generation that has a winrate of 70-90% against a random is like a working example

reCurse: Hmm.

reCurse: Dunno about the game but

reCurse: I'd expect 100% or close to

reCurse: You're giving random player way too much credit

MSmits: for TTT that is almost impossible, but for a longer game with say 30-40 turns, the random player will make many mistakes

MSmits: so then 100% should be possible

Marchete: https://github.com/suragnair/alpha-zero-general

Marchete: experiments graph

reCurse: Yeah not a fan of that repo hehe

Marchete: neither me

MSmits: i did not get that to work

reCurse: I haven't found a single repo I liked tbh

reCurse: So there's that

Marchete: that graph is ridiculous in the sense that they compare against random and greedy

derjack: generally my 1st or 2nd generation beats random by >95%

derjack: yeah. if you dont know even any heuristic, compare it to vanilla mcts

Marchete: but if you have a low learning rate

Marchete: and low sample count

Marchete: it simply can't reach high winrates

MSmits: Marchete a generation could be many epochs

MSmits: i dont know how derjack defines it

reCurse: Epoch is ill-defined imo

Marchete: I know, I have 20

reCurse: Number of steps is much better

MSmits: yes reCurse i noticed this

MSmits: but there are steps within steps

reCurse: Um?

reCurse: There's only one kind of step

reCurse: The one that updates your parameters

derjack: 1 generation: self-play N games, 2-3 epochs over replay buffer

MSmits: you have a game generation step and then a learning step

MSmits: the learning step itself might have multiple iterations

MSmits: so steps within steps

emh: wheels within wheels within wheels

reCurse: You're specializing this way too much for me

reCurse: I was just in favor of replacing epochs with steps

reCurse: Which also applies to SL

reCurse: shrug

MSmits: it's better to just always be clear

reCurse: Step is very clear

reCurse: Epoch isn't

MSmits: like derjack just did i mean

MSmits: replacing epoch by step maybe

MSmits: it's like when you run mcts within mcts, you kinda have to clarify which one you're talking about

reCurse: Which makes no sense

Marchete: you convinced me, I'm going to change "epochs" in Tensorflow repo

Marchete: for steps

MSmits: nvm we're saying different thing reCurse

derjack: canadian english vs world english

reCurse: That's very nice of you to assume other canadians understand me fine

MSmits: i was just saying that if you say "learning step", you could be talking about the whole thing that jacek defines as a generation or a step that he defines as an epoch. They can both be called a step

MSmits: one is a subdivision of the other

derjack: learning step could also mean learning rate D:

MSmits: yep

reCurse: Am I wrong? I had the impression most of the literature referred to 'steps' when talking about an update of the parameters

RoboStac: epochs is the parameter name in keras for how many times you train on each batch

MSmits: so thats why i am saying, use whole sentences to describe things. with 2 words there's too much chance for miscommunication :)

reCurse: There's a reason why we make new words with new definitions

reCurse: Otherwise communication becomes a burden

MSmits: yes, but if people use them in different ways, they dont help :)

RoboStac: so people who've used keras tend to know it as that

derjack: yeah like someone saying he uses MLP and it's not obvious which one it is

reCurse: I was under the impression no one misappropriated the term 'steps'

MSmits: its only not obvious when you do it :P

RoboStac: theres a steps_per_epoch parameter too if you want to make it more confusing :)

MSmits: reCurse let me put it this way, possibly when NN professionals speak of steps they all mean the same thing. But when we discuss it here, I doubt it

reCurse: Well I don't expect much here when searchless bots are continuously referred to as 'heuristic' bots after years and years...

reCurse: Doesn't mean I won't try

MSmits: yeah, the trying itself is exactly what I meant. Explaining things is giving meaning to definitions

MSmits: so keep doing that :0

reCurse: I was genuinely asking if I was wrong in assuming that in the domain

reCurse: I don't really care about the CG meta :P

MSmits: ahh ok

MSmits: what would you call a searchless bot reCurse?

reCurse: A reflex bot if you prefer

MSmits: oh right, i've seen that used in papers

sprkrd: rule-based?

reCurse: Yes

sprkrd: I like rule-based :)

reCurse: Robostac: This is disturbing, I'll have to look this up

MSmits: is "heuristic" just poorly defined or misused?

reCurse: Misused

kovi: in most cases heuristic bot is just greedy

RoboStac: https://keras.io/api/models/model_training_apis/

reCurse: Thanks

reCurse: Ok so 'training steps' then

MSmits: that's very clear

Marchete: but fit also has an epochs parameter

Marchete: no?

reCurse: No

MSmits: so an epoch is training 1 time on the entire dataset and training step is 1 data point?

reCurse: Training step is a single update of the parameters

reCurse: From a backpropagated gradient

MSmits: oh, so it depends on batching

sprkrd: btw, there's a slight undesired effect in the softmax function pasted earlier by Marchete. I guess most of you already know it and probably it's written like that for illustrative purposes, but one should always subtract the max value of the xs vector to prevent float overflow with the exponential function (in case someone happens to use that).

reCurse: Yes

reCurse: sprkrd: Yeah the numerically stable softmax trick
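
The trick in a minimal sketch: subtracting the max makes every exponent <= 0, so exp() cannot overflow, and the result is mathematically unchanged:

    import numpy as np

    def softmax(xs):
        z = np.asarray(xs, dtype=np.float64)
        e = np.exp(z - z.max())
        return e / e.sum()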

Marchete: that's a simple example

Marchete: I rely on TF's softmax

Monarc: is there a way to put my location on my research i have edge

Marchete: and no, I have no idea

MSmits: Monarc weird question

MSmits: maybe rephrase

Shelby: I'm looking for a tensorflow classic puzzle. Can I have its name?

RoboStac: it got removed

MSmits: dont think we have that

reCurse: An epoch is ill-defined, usually they mean looping on the entire dataset, but it becomes very dependent on other hyperparameters. Like changing batch size has a radical effect on what 'epoch' means and how you can compare it (you don't)

MSmits: because it was python 2 wasnt it?

reCurse: Or what acquiring a bigger dataset means

reCurse: Or how to even define that in the context of infinite data like RL

reCurse: It's annoying IMO

MSmits: reCurse do you mean that if you have 1000 data points and a batch size of 50, you do 20 updates in 1 epoch, and if you use batch size 1 you do 1000 updates in 1 epoch, which is why it's bad to compare?

reCurse: Exactly

MSmits: ok, but that doesnt mean its poorly defined, it's just a bad metric

Shelby: Do you know of a replacement for the puzzle? I would like to learn.

reCurse: Still poorly defined, how do you use that in RL?

MSmits: useless things can be very well defined

reCurse: How do you define an epoch in RL?

MSmits: the number of times you learn over the full data set

reCurse: What is 'the full dataset'

MSmits: whatever you define it as i suppose

reCurse: ...

reCurse: How is that not ill-defined lol

MSmits: then the full data set is ill defined :P

bigman69: all my test cases come back positive yet when i submit i dont get 100%. I understand it's to prevent hard coding but how am i supposed to know where the error is?

MSmits: i guess by extension...

MSmits: perhaps epoch is a remnant from an early time of RL when people would still always use all their data

reCurse: Or let's say you start from a smaller dataset to test

reCurse: Then you move on to the bigger one

MSmits: as opposed to sampling it or whatnot

reCurse: All of a sudden epoch means very different things

reCurse: Compare the result of an epoch? What?

reCurse: You didn't even do the same number of updates

MSmits: sure, but would learning step not have the same problem? I mean you define it as one backpropagation update, but it also matters how much data is in there

MSmits: i mean training step

reCurse: No that's batch size

reCurse: That's constant

MSmits: alright, so if you use batch size and training step together, they make a well defined combo

reCurse: Yes

reCurse: Well

MSmits: sure, I'll accept that

reCurse: You forgot data reuse

MSmits: what do you mean

reCurse: How many times a data has been sampled over training

reCurse: Leads to overfit typically

MSmits: hmm, why would someone do that on purpose

reCurse: There's a sweet spot for extracting the maximum usefulness of a piece of data

reCurse: Too little and you're wasteful, too much and you overfit

MSmits: yeah that occurred to me

MSmits: especially if you sample

LuisAFK: how can i open a private chat I accidentally closed?

reCurse: So what I meant is

MSmits: how much do you use the same sample from a bigger set, before taking a new sample

reCurse: Batch size 256, 100 training steps

reCurse: Ok, what if your dataset is 1k vs 10k

reCurse: Can have an impact

reCurse: If you dataset is 1M vs 10M you can probably not care

MSmits: right

reCurse: So those three together make a very stable definition in my mind

reCurse: Disclaimer: Don't forget I'm a clown, not an actual scientist

Nightzx: hello

MSmits: :clown:

MSmits: the only reason you're not "officially" a scientist is because you dont publish.

MSmits: you know how much crap is published

reCurse: No the actual reason is I didn't take any formal education on that and wouldn't dare to pretend to hold my candle to people actually working the field

reCurse: Don't mistake me for a teacher

MSmits: part of this is imposter effect. I think there are people working in that field that know less than you do. Not everyone is a star in their field :P

reCurse: I'm not being cute I actually mean it

MSmits: I know

MSmits: I thought you did a formal computer science education?

reCurse: CS sure

reCurse: ML, stats, no

MSmits: ahh ok

MSmits: I wonder how often it happens that competitive coders have an idea that is an improvement on existing science, then not share it for competitive reasons

MSmits: it must happen at least sometimes

reCurse: I sometimes get the impression some papers on AI are done on settings that would be equivalent to a gold league bot in terms of performance and whatnot.

MSmits: definitely that

reCurse: Or comparing MCTS results with 100 iterations... what?

MSmits: you can use the ideas, but they are poorly implemented

MSmits: yeah, thats also what i meant by star in their field. Sometimes someone is just going for their PHD with barely any experience and they do actual science, but not very well

reCurse: If you go the empirical route you need to at least be in competitive terms

MSmits: agreed

reCurse: Otherwise what credibility does the result have

reCurse: But a lot of time there's not much reference

reCurse: That's why CG is a goldmine in that regard

kovi: i agree. and its a weird contradiction if you consider chess/bitboarding

MSmits: it's partially because knowing a lot about theory of AI, doesn't make you able to code a bitboard using simd

MSmits: or use anything other than python

sprkrd: scientists are not necessarily good coders, I'd say gold league bots enjoy micro-optimizations (e.g. like bitboards) that paper authors don't care to implement

reCurse: kovi: contradiction?

reCurse: Yeah but if you compare your approach empirically using subpar implementations, how am I supposed to believe the results?

sprkrd: and I would say that is fine as long as all the compared alternatives are using the same subpar environment implementations

MSmits: sprkrd i dont entirely agree, for example, minimax works better than mcts with low performance

MSmits: with higher performance, mcts starts to take over

MSmits: depending on the game ofc

MSmits: (and eval quality)

sprkrd: i was thinking more of comparing apples to apples

sprkrd: MCTS variants among them, for instance

kovi: chat slow/dead?

reCurse: Chat is fine on my end

kovi: meh, dice duel

emh: I was thinking about making an educational video on bitboarding. anyone have experience using manim? I tried a bit with manim (software behind 3 blue 1 brown), but now I'm thinking js+html might be easier

MSmits: sprkrd, some mcts variants might work better in one amount of calculation time and then if you add more, the other variant could be better

MSmits: whats manim?

sprkrd: Most papers I've seen about MCTS don't use calculation time as a meaningful metric

sprkrd: they use amount of simulations

kovi: contradiction: for mcts/whatever, CG top >= top science, but for the chess bitboarding you mentioned we are beginners

MSmits: One of these days I am going to show some ways to bitboard with UTTT in a tech.io article I think. Just because thats where most players start

sprkrd: whatever time it takes to reach that many simulations

reCurse: kovi: I don't think I communicated properly again. I meant 'some' papers are way subpar or are working in games with no good reference.

emh: MSmits manim is mathematical animation software

reCurse: In no way did I mean 'top science'

emh: https://github.com/3b1b/manim

MSmits: sprkrd thats better indeed. But even then, you also have to use examples with high rollout counts (> 1M )

reCurse: Chess is also different because it's actually a game that received an enormous amount of competitive attention

kovi: true, without reference platform/game hard to assess value

MSmits: and sometimes it's an unfair comparison. 1 rollout could be more expensive, because one mcts variant is heavier than the other

kovi: yes, that is why we are getting better... we have a reference platform and pet multis

reCurse: If you compare your approach and you mention in experiment settings '100 iterations for MCTS' I'm closing the tab

reCurse: Stuff like that

MSmits: yeah, but 1k, 10k, 100k, 1M would be ok

MSmits: 1k is not bad if you have very small branching

reCurse: You get the idea

MSmits: ye

sprkrd: Maybe they're 100 high quality iterations :)

sprkrd: Maybe the default policy is a gold league bot on its own

MSmits: even so, you lose too much to exploration

MSmits: mcts just doesnt work with that low iterations

MSmits: minimax always works

Nerchio: offtopic but is there any place where you can check how many legend/gold bots you have

MSmits: yeah CG multi

Nerchio: thanks MSmits

MSmits: https://cgmulti.azke.fr/players?p=Nerchio

MSmits: nice job on ooc Nerchio :)

sprkrd: Anywho, I haven't seen any paper so daring as to propose experiments with 100 MCTS iteration :joy:

reCurse: I did more than once :/

reCurse: Sometimes less

sprkrd: That's certainly ridiculous

MSmits: I've seen some crap ones. Not sure if it was 100, but it was bad

sprkrd: Accepted in a reputable conference/journal?

sprkrd: Or more like school work?

reCurse: I don't pay attention to that, maybe I should

MSmits: conference/journal. I dont know their reputations

reCurse: But maybe I would miss a good idea

reCurse: I still appreciate learning from different ideas/approaches, but at some point I'll filter on bad experiments

MSmits: I like the winands ones.

MSmits: not all good, but mostly good reads

reCurse: winands are very good

MSmits: the pseudo code is a bit crappy :P

sprkrd: Sure, you shouldn't skip paper based on the reputation of the venue, but it's certainly more likely to find bad papers on arxiv than on AAAI

MSmits: but at least they give code :)

reCurse: I don't know, the vast majority of my reading is on arxiv

reCurse: Found very very good stuff there

reCurse: I'd rather not pay publishers who exploit researchers

sprkrd: Mind you, some of the good stuff on arxiv have actually been published somewhere else, and they're on arxiv so the paper is publicly available

sprkrd: While there are certainly ethic issues in academia, peer review is almost always a good thing

reCurse: The gold nugget is finding the paper review on openreview

reCurse: If all papers had that it would be amazing

sprkrd: Shameless (and off-topic) plug: https://www.youtube.com/watch?v=8n8mF-RmsNs (paper available on description :))

reCurse: Frame rate makes it a bit difficult to judge

sprkrd: the good stuff is the bloopers at the end

reCurse: It's cool though

sprkrd: computer was running like lava while I was recording this

sprkrd: I'm thankful to have footage at all

reCurse: Hehe yeah, it's just animation at 10fps is difficult to see

kovi: no flying or underground?

sprkrd: crawling :)

sprkrd: flying and underground I fixed at the very beginning

sprkrd: Too many flying darwins

kovi: sommersault or cartwheel

kovi: ai finding the holes or bending the rules is fun

sprkrd: Oh, when I ran the thing on the real robot, instead of walking forward, it did the moonwalking

reCurse: Nice

sprkrd: Didn't have time to fix that, but it was pretty funny

sprkrd: https://www.youtube.com/watch?v=lOaWvOA9cb4

reCurse: Cool robot

jacek: oh my

StevensGino: isn't it from 2017?

CamTheHelpDesk: since the puzzle this week is bot programming, what counts as solving it for the homepage path?

RoboStac: getting out of the initial league

LuisAFK: https://www.codingame.com/ide/puzzle/hidden-messages-in-images

Shelby: http://chat.codingame.com/pastebin/cf7931f2-0aff-46e6-ba67-9423d51c9574

Shelby: ... That's frustrating...

Shelby: I was talking about how it sucks to get thrown into Bronze when I've never tested my code in Wood 1.

Shelby: Or even got to read any of the Wood 1 instructions.

Shelby: Happens almost every contest I join.

reCurse: You should just be happy to be out of the woods imo

reCurse: Focus on the real game rather than made up ones

Westicles: This is what the kids call a humble brag

Shelby: Yeah, but I don't get a chance to look at any of the rules introduced in Wood 1, and I don't get to play with it.

reCurse: I guess, but you usually don't get much usefulness out of it once you promote still

reCurse: In my experience anyway

FrancoRoura: All games are made up in the end, I also think it'd be fun to have one more game to play with

Shelby: I've never breached past Bronze (at least not that I know of).

reCurse: Sure, I meant 'made up' in the sense that you made the actual one, then you just sort of butcher it in an attempt to be more beginner friendly

Shelby: I always get stuck when I don't get to do anything with Wood 1.

reCurse: Here's chess, oh but we're going to start with only pawns

jacek: shameless advertising

olaf_surgut: does making some kind of skip list in CvZ for dead humans/zombies make for a sensible speedup?

jacek: but that could also mean breakthrough advertising :thinking:

reCurse: All intended :P

Shelby: I don't even know what rules are added between Wood 2 and Wood 1, versus if any rules are added between Wood 1 and Bronze.

jacek: check out the referee code

Westicles: Shelby, part of it is the superior education system in the US. Whatever code you throw together will go far

Shelby: lmao

Shelby: Superior education system = my mother

Shelby: She didn't teach me everything I know. Not by a longshot. But she did teach me the basics of coding logic when I was a very small child, and she taught me how to google as I grew up.

reCurse: Taught you how to google? You're already ahead the average

Shelby: She went to online university for web design when I was 8, so I got a major dose of college education while in elementary school.

jacek: she taught you javascript? thats child abuse!

Shelby: Actually, the vbScript with classic ASP might be a lot closer to child abuse. That code is nasty.

Shelby: note: I do _not_ mean vb.net or asp.net.

reCurse: I don't know, people still code in PHP

reCurse: Isn't that the same thing? :P

Shelby: Javascript and PHP 5 are pretty similar. PHP does a weird thing where you have to explicitly pass it by reference, or it copies the values. And you have to prefix variables. Besides that, the syntax and data types are pretty similar. It was a shock to go from mostly JS and PHP over to C# and Java, where data maps aren't first-class.

reCurse: I was comparing PHP to classical ASP

Shelby: http://chat.codingame.com/pastebin/fb783e1f-f530-47b4-99df-50d504fc3f18

Shelby: ...

jacek: oO

Shelby: C# is super nice. PHP is okay. Java is torture. vbScript... you might as well be using Bash for your web server.

Nerchio: you are torture

Nerchio: :grin:

Scarfield: "no offence" :p

ikustom: none taken.

Xzoky174: I've been stuck on level 6 for so long...

jacek: :no_mouth:

ErrorCookie: I am doing a challenge where I need to find exits of a maze. I came up with a recursive function that calls itself for every possible direction, and from those directions again for each possible direction, and so on. Is there a "better" algorithm for this?

jacek: DFS?

ErrorCookie: Ok thanks .(^_^).

jacek: what puzzle

ErrorCookie: Medium: Maze

jacek: so yeah, DFS and/or BFS would be good

jacek: Space maze. Community success rate: 5% :thinking:

JBM: SUSPICIOUS

Scarfield: Member of the 5% ?

sprkrd: Have 20 people tried and one of them is the author?

Jaminrock: lol

Jaminrock: the hypnotoad himself

Westicles: I highly recommend playing space maze manually. Much more fun than writing some boring program

jacek: do you do that with csb as well

Westicles: :thinking:

Zimtime907: Hi Yall!

Gers2017: bbbbbbb

sprkrd: "some boring program"

sprkrd: better write a fun program, no?

jacek: functional?

Psuedo: Is it normal to struggle with these problems but in projects I rarely find myself struggling?

Psuedo: I think a lot of what I'm facing is trouble understanding the problem, as well.

jacek: puzzles here or most stuff here are unlike anything i do at work

AllYourTrees: are ppl testing out NNs in the connect 4 thing?

jacek: dunno, i dont. i use good old N-tuple

AllYourTrees: what's N-tuple?

AntiSquid: the *thing*

jacek: patterns

jacek: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.5111&rep=rep1&type=pdf
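
In the spirit of that paper, an N-tuple evaluation is just lookup tables indexed by board patterns; a minimal sketch assuming a flat board of cell values 0/1/2 and one hypothetical weight table (LUT) per tuple:

    def ntuple_eval(board, tuples, luts):
        # Each tuple is a fixed list of cell indices; the cell contents
        # along it form a base-3 index into that tuple's lookup table.
        score = 0.0
        for t, lut in zip(tuples, luts):
            idx = 0
            for cell in t:
                idx = idx * 3 + board[cell]
            score += lut[idx]
        return score

Training adjusts the LUT entries (e.g. by TD learning), and evaluation stays cheap since it is only table lookups.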

DavidOfEarth: 75

YoloTheBear: This 1D spreadsheet: my code keeps failing on the last test, Deep Birecursion. idk how I could make the recursion more optimized, it just times out

therealbeef: timeouts are sometimes caused by crashes
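
If the recursion itself is correct, memoization is the usual fix for that kind of timeout; a sketch assuming a hypothetical compute(cell, value) evaluator that resolves a cell's references through the callback:

    import sys
    sys.setrecursionlimit(10000)

    memo = {}
    def value(cell):
        # Each cell is computed once; repeated references to the same
        # deep subexpression no longer re-trigger the whole recursion.
        if cell not in memo:
            memo[cell] = compute(cell, value)  # hypothetical evaluator
        return memo[cell]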

DUNKEN: Has anyone tried this out:

DUNKEN: https://www.codingame.com/multiplayer/bot-programming/connect-4

Bob23: hello world!

Smelty: :eyes:

StepBack13: DUNKEN, trying it now. diagonals are so hard to stop!

JL07: i'm op

adradr: cout<<memekbau

JL07: cout << "memekbau";

JimmyJams: reverse code clashes should validate against the hidden tests before submitting. Otherwise we have no idea if our solution is the right one.

JimmyJams: they don't need to show the hidden ones, just tell us if they passed or failed

Smelty: hmm

Slavvy: bro how do I continue

TNtube: JimmyJams i approve this

TNtube: it's really frustrating when all your tests pass but the real answer turns out to be a totally different algo

Smelty: yep i agree

Smelty: might want to just add more test cases for reverse mode though